Peter Hase

18 papers · 2020–2025 · 6 top CS/AI conferences

Achievements

🏃 Academic Marathon (5) 🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (35) 🤝 Dynamic Duo (18) ❓ The Questioner (7) ⚡ Prolific Year (6) 🗃️ Keyword Collector (70) 🔥 Unstoppable (6) 💎 Century Club (18)

Conferences

NIPS (6) ACL (3) EMNLP (3) ICLR (3) EACL (2) NAACL (1)

Top co-authors

Mohit Bansal (18) Swarnadeep Saha (4) Elias Stengel-Eskin (3) Archiki Prasad (2) Harry Xie (2) Nazneen Rajani (2) Shiyue Zhang (2) Zhuofan Ying (2) Han Guo (1) Peter Clark (1)

Keywords

large language model (4) feature importance (3) preference optimization (2) in-context learning (2) knowledge editing (2) explanation generation (2) model simulatability (2) language model (2) question answering (2) natural language explanation (2) confidence calibration (1) visual question answering (1) domain generalization (1) explainable ai (1) prompt engineering (1) feature weighting (1) transfer learning (1) belief updating (1) out-of-distribution generalization (1) natural language generation (1)

Papers

System 1.x: Learning to Balance Fast and Slow Planning with Language Models · ICLR 2025
Teaching Models to Balance Resisting and Accepting Persuasion · NAACL 2025
LACIE: Listener-Aware Finetuning for Calibration in Large Language Models · NIPS 2024
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks · ICLR 2024
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks · ACL 2024
Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models · EACL 2023
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models · NIPS 2023
Adaptive Contextual Perception: How To Generalize To New Backgrounds and Ambiguous Objects · NIPS 2023
Can Language Models Teach? Teacher Explanations Improve Student Performance via Personalization · NIPS 2023
GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models · EACL 2023
Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees · ICLR 2023
VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives · NIPS 2022
Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations · EMNLP 2022
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data · ACL 2022
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging · EMNLP 2021
The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations · NIPS 2021
Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? · EMNLP 2020
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? · ACL 2020