conftrace_

Geoffrey Irving

7 papers · 2016–2024 · 4 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (4) 🏃 Academic Marathon (8) 🐝 Cross-Pollinator (12)

🐣 Hot Topic Early Bird 🌈 Renaissance Researcher (6) 🌟 Keyword Trendsetter Combo (3) 👥 Mega-Team (28) 📈 Trend Setter

Conferences

NIPS (3) ICML (2) EMNLP (1) OSDI (1)

Top co-authors

Trevor Cai (2) Roman Ring (2) Amelia Glaese (2) Saffron Huang (2) Ethan Perez (1) Josef Urban (1) Matthieu Devin (1) Shane Legg (1) Andy Davis (1) Po-Sen Huang (1)

Research topics

Models (1) Systems (1)

Keywords

harmful content (2) representation learning (1) reinforcement learning (1) imitation learning (1) preference learning (1) question answering (1) prompt engineering (1) toxicity detection (1) language model evaluation (1) document retrieval (1) responsible ai (1) distributed computing (1) distributed training (1) demonstration learning (1) reward learning (1) deep neural network (1) language model (1) safety evaluation (1) automated theorem proving (1) red teaming (1)

Papers

Scalable AI Safety via Doubly-Efficient Debate · ICML 2024
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models · NIPS 2022
Red Teaming Language Models with Language Models · EMNLP 2022
Improving Language Models by Retrieving from Trillions of Tokens · ICML 2022
Reward learning from human preferences and demonstrations in Atari · NIPS 2018
TensorFlow: A System for Large-Scale Machine Learning · OSDI 2016
DeepMath - Deep Sequence Models for Premise Selection · NIPS 2016