Maarten Sap

79 papers · 2014–2026 · 11 top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (11) 🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (25) 🤝 Dynamic Duo (33) 🔬 Deep Specialist (15) 🧬 Topic Evolution 🏆 Keyword Champion (7) 🏆 Grand Slam ⚡ Prolific Year (11) ❓ The Questioner (7) 🗃️ Keyword Collector (318) 💎 Century Club (76) 📈 Trend Setter 🔥 Unstoppable (9) 🚀 Conference Pioneer

Conferences

ACL (26) EMNLP (25) NAACL (11) EACL (4) ICLR (3) AAAI (2) ICML (2) IJCNLP (2) NIPS (2) CONLL (1) CVPR (1)

Top co-authors

Yejin Choi (33) Xuhui Zhou (16) Noah A. Smith (12) Hannah Rashkin (8) Akhila Yerukola (8) Ximing Lu (7) Liwei Jiang (7) Ronan Le Bras (7) Jena D. Hwang (6) Swabha Swayamdipta (5)

Keywords

large language model (25) language model (11) text classification (9) commonsense reasoning (8) toxicity detection (7) natural language processing (6) ai safety (5) question answering (5) hate speech detection (5) text generation (4) social media analysis (4) toxic language detection (4) multi-agent system (4) sentiment analysis (3) bias mitigation (3) theory of mind (3) commonsense knowledge (3) natural language understanding (3) social intelligence (3) natural language inference (3)

Papers

Common Sense or Ableism? Rethinking Commonsense Reasoning Through the Lens of Disability · EACL 2026
Social Story Frames: Contextual Reasoning about Narrative Intent and Reception · ACL 2026
Out of Style: RAG’s Fragility to Linguistic Variation · EACL 2026
Rejected Dialects: Biases Against African American Language in Reward Models · NAACL 2025
SOTOPIA-S4: a user-friendly system for flexible, customizable, and large-scale social simulation · NAACL 2025
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents · NAACL 2025
REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance · NAACL 2025
NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models · NAACL 2025
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior · ICML 2025
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents · ICML 2025
SOCIAL SCAFFOLDS: A Generalization Framework for Social Understanding Tasks · EMNLP 2025
Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics · EMNLP 2025
Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication · EMNLP 2025
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data · ACL 2025
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures · ACL 2025
Mitigating Bias in RAG: Controlling the Embedder · ACL 2025
Stereotype or Personalization? User Identity Biases Chatbot Recommendations · ACL 2025
1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning · ACL 2025
AutoPresent: Designing Structured Visuals from Scratch · CVPR 2025
Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences · EMNLP 2025
Relying on the Unreliable: The Impact of Language Models’ Reluctance to Express Uncertainty · ACL 2024
Where Do People Tell Stories Online? Story Detection Across Online Communities · ACL 2024
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents · ACL 2024
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs · ACL 2024
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models · EACL 2024
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models · ICLR 2024
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory · ICLR 2024
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents · ICLR 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models · NIPS 2024
HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs · EMNLP 2024
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties · AAAI 2024
The Empirical Variability of Narrative Perceptions of Social Media Texts · EMNLP 2024
Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs · EMNLP 2024
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization · EMNLP 2023
NLPositionality: Characterizing Design Biases of Datasets and Models · ACL 2023
From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models · ACL 2023
Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts · ACL 2023
Riveter: Measuring Power and Social Dynamics Between Entities · ACL 2023
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements · ACL 2023
BiasX: “Thinking Slow” in Toxic Content Moderation with Explanations of Implied Social Biases · EMNLP 2023
Modeling Empathic Similarity in Personal Narratives · EMNLP 2023
Don’t Take This Out of Context!: On the Need for Contextual Models and Evaluations for Stylistic Rewriting · EMNLP 2023
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions · EMNLP 2023
Beyond Denouncing Hate: Strategies for Countering Implied Biases and Stereotypes in Language · EMNLP 2023
ProsocialDialog: A Prosocial Backbone for Conversational Agents · EMNLP 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment · NIPS 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection · ACL 2022
Misinfo Reaction Frames: Reasoning about Readers’ Reactions to News Headlines · ACL 2022
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection · NAACL 2022
Aligning to Social Norms and Values in Interactive Narratives · NAACL 2022
Uncovering Surprising Event Boundaries in Narratives · NAACL 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs · EMNLP 2022
Crowdsourcing Beyond Annotation: Case Studies in Benchmark Data Collection · EMNLP 2021
Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts · EMNLP 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts · ACL 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus · EMNLP 2021
Detoxifying Language Models Risks Marginalizing Minority Voices · NAACL 2021
Challenges in Automated Debiasing for Toxic Language Detection · EACL 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts · IJCNLP 2021
Social Chemistry 101: Learning to Reason about Social and Moral Norms · EMNLP 2020
Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models · ACL 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models · EMNLP 2020
Exploring the Effect of Author and Reader Identity in Online Story Writing: the STORIESINTHEWILD Corpus · ACL 2020
Commonsense Reasoning for Natural Language Processing · ACL 2020
Social Bias Frames: Reasoning about Social and Power Implications of Language · ACL 2020
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction · EMNLP 2020
Social IQa: Commonsense Reasoning about Social Interactions · IJCNLP 2019
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction · ACL 2019
ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning · AAAI 2019
The Risk of Racial Bias in Hate Speech Detection · ACL 2019
Social IQa: Commonsense Reasoning about Social Interactions · EMNLP 2019
Sounding Board: A User-Centric and Content-Driven Social Chatbot · NAACL 2018
Event2Mind: Commonsense Inference on Events, Intents, and Reactions · ACL 2018
Modeling Naive Psychology of Characters in Simple Commonsense Stories · ACL 2018
Connotation Frames of Power and Agency in Modern Films · EMNLP 2017
The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task · CONLL 2017
DLATK: Differential Language Analysis ToolKit · EMNLP 2017
Extracting Human Temporal Orientation from Facebook Language · NAACL 2015
Developing Age and Gender Predictive Lexica over Social Media · EMNLP 2014