Ethan Perez

24 papers · 2018–2025 · 7 top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (7) 🐝 Cross-Pollinator (6) 🌈 Renaissance Researcher (8) 👥 Mega-Team (63) 👑 Triple Crown 🌱 Topic Pioneer 🧬 Topic Evolution 📈 Trend Setter 🚀 Conference Pioneer 💎 Century Club (24) 🗃️ Keyword Collector (85) 🔥 Unstoppable (8)

Conferences

ICLR (6) EMNLP (5) ACL (4) NIPS (4) ICML (3) ECCV (1) IJCNLP (1)

Top co-authors

Samuel R. Bowman (6) Douwe Kiela (6) Kyunghyun Cho (5) Mrinank Sharma (4) Tomasz Korbak (4) Akbir Khan (3) Nicholas Schiefer (3) Jason Weston (3) Ansh Radhakrishnan (3) Henry Sleight (3)

Keywords

language model (6) question answering (5) few-shot learning (3) semantic parsing (2) prompt engineering (2) red teaming (2) evidence selection (2) reinforcement learning from human feedback (2) large language model (2) machine reading comprehension (2) reinforcement learning (2) imitation learning (1) question decomposition (1) kl divergence (1) model selection (1) natural language processing (1) language model evaluation (1) bayesian inference (1) text representation (1) passage retrieval (1)

Papers

Language Models Learn to Mislead Humans via RLHF · ICLR 2025
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models · ICLR 2025
Looking Inward: Language Models Can Learn About Themselves by Introspection · ICLR 2025
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats · ICLR 2025
Debating with More Persuasive LLMs Leads to More Truthful Answers · ICML 2024
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning · ICLR 2024
Towards Understanding Sycophancy in Language Models · ICLR 2024
Many-shot Jailbreaking · NIPS 2024
Pretraining Language Models with Human Preferences · ICML 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting · NIPS 2023
Few-shot Adaptation Works with UnpredicTable Data · ACL 2023
Discovering Language Model Behaviors with Model-Written Evaluations · ACL 2023
Red Teaming Language Models with Language Models · EMNLP 2022
RL with KL penalties is better viewed as Bayesian inference · EMNLP 2022
Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions · ACL 2022
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length · ICML 2021
True Few-Shot Learning with Language Models · NIPS 2021
Case-based Reasoning for Natural Language Queries over Knowledge Bases · EMNLP 2021
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks · NIPS 2020
Unsupervised Question Decomposition for Question Answering · EMNLP 2020
Finding Generalizable Evidence by Learning to Convince Q&A Models · EMNLP 2019
ELI5: Long Form Question Answering · ACL 2019
Finding Generalizable Evidence by Learning to Convince Q&A Models · IJCNLP 2019
Visual Reasoning with Multi-hop Feature Modulation · ECCV 2018