Maarten Sap
79 papers · 2014–2026 · 11 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (11)
Hot Topic Early Bird
Keyword Pioneer
Interdisciplinary Bridge
Academic Marathon (11)
Keyword Trendsetter Combo (3)
Conference Loyalist (25)
Dynamic Duo (33)
Deep Specialist (15)
Topic Evolution
Keyword Champion (7)
Grand Slam
Prolific Year (11)
The Questioner (7)
Keyword Collector (318)
Century Club (76)
Trend Setter
Unstoppable (9)
Conference Pioneer
Conferences
ACL (26)
EMNLP (25)
NAACL (11)
EACL (4)
ICLR (3)
AAAI (2)
ICML (2)
IJCNLP (2)
NIPS (2)
CONLL (1)
CVPR (1)
Keywords
large language model (25)
language model (11)
text classification (9)
commonsense reasoning (8)
toxicity detection (7)
natural language processing (6)
ai safety (5)
question answering (5)
hate speech detection (5)
text generation (4)
social media analysis (4)
toxic language detection (4)
multi-agent system (4)
sentiment analysis (3)
bias mitigation (3)
theory of mind (3)
commonsense knowledge (3)
natural language understanding (3)
social intelligence (3)
natural language inference (3)
Papers
Common Sense or Ableism? Rethinking Commonsense Reasoning Through the Lens of Disability
EACL 2026
Social Story Frames: Contextual Reasoning about Narrative Intent and Reception
ACL 2026
Out of Style: RAG’s Fragility to Linguistic Variation
EACL 2026
Rejected Dialects: Biases Against African American Language in Reward Models
NAACL 2025
SOTOPIA-S4: a user-friendly system for flexible, customizable, and large-scale social simulation
NAACL 2025
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
NAACL 2025
REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance
NAACL 2025
NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models
NAACL 2025
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
ICML 2025
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
ICML 2025
SOCIAL SCAFFOLDS: A Generalization Framework for Social Understanding Tasks
EMNLP 2025
Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics
EMNLP 2025
Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication
EMNLP 2025
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
ACL 2025
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
ACL 2025
Mitigating Bias in RAG: Controlling the Embedder
ACL 2025
Stereotype or Personalization? User Identity Biases Chatbot Recommendations
ACL 2025
1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning
ACL 2025
AutoPresent: Designing Structured Visuals from Scratch
CVPR 2025
Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences
EMNLP 2025
Relying on the Unreliable: The Impact of Language Models’ Reluctance to Express Uncertainty
ACL 2024
Where Do People Tell Stories Online? Story Detection Across Online Communities
ACL 2024
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
ACL 2024
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
ACL 2024
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
EACL 2024
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
ICLR 2024
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
ICLR 2024
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
ICLR 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NIPS 2024
HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs
EMNLP 2024
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
AAAI 2024
The Empirical Variability of Narrative Perceptions of Social Media Texts
EMNLP 2024
Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
EMNLP 2024
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
EMNLP 2023
NLPositionality: Characterizing Design Biases of Datasets and Models
ACL 2023
From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
ACL 2023
Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts
ACL 2023
Riveter: Measuring Power and Social Dynamics Between Entities
ACL 2023
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
ACL 2023
BiasX: “Thinking Slow” in Toxic Content Moderation with Explanations of Implied Social Biases
EMNLP 2023
Modeling Empathic Similarity in Personal Narratives
EMNLP 2023
Don’t Take This Out of Context!: On the Need for Contextual Models and Evaluations for Stylistic Rewriting
EMNLP 2023
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
EMNLP 2023
Beyond Denouncing Hate: Strategies for Countering Implied Biases and Stereotypes in Language
EMNLP 2023
ProsocialDialog: A Prosocial Backbone for Conversational Agents
EMNLP 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
NIPS 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
ACL 2022
Misinfo Reaction Frames: Reasoning about Readers’ Reactions to News Headlines
ACL 2022
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
NAACL 2022
Aligning to Social Norms and Values in Interactive Narratives
NAACL 2022
Uncovering Surprising Event Boundaries in Narratives
NAACL 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
EMNLP 2022
Crowdsourcing Beyond Annotation: Case Studies in Benchmark Data Collection
EMNLP 2021
Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts
EMNLP 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
ACL 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
EMNLP 2021
Detoxifying Language Models Risks Marginalizing Minority Voices
NAACL 2021
Challenges in Automated Debiasing for Toxic Language Detection
EACL 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
IJCNLP 2021
Social Chemistry 101: Learning to Reason about Social and Moral Norms
EMNLP 2020
Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models
ACL 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
EMNLP 2020
Exploring the Effect of Author and Reader Identity in Online Story Writing: the STORIESINTHEWILD Corpus
ACL 2020
Commonsense Reasoning for Natural Language Processing
ACL 2020
Social Bias Frames: Reasoning about Social and Power Implications of Language
ACL 2020
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction
EMNLP 2020
Social IQa: Commonsense Reasoning about Social Interactions
IJCNLP 2019
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
ACL 2019
ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning
AAAI 2019
The Risk of Racial Bias in Hate Speech Detection
ACL 2019
Social IQa: Commonsense Reasoning about Social Interactions
EMNLP 2019
Sounding Board: A User-Centric and Content-Driven Social Chatbot
NAACL 2018
Event2Mind: Commonsense Inference on Events, Intents, and Reactions
ACL 2018
Modeling Naive Psychology of Characters in Simple Commonsense Stories
ACL 2018
Connotation Frames of Power and Agency in Modern Films
EMNLP 2017
The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task
CONLL 2017
DLATK: Differential Language Analysis ToolKit
EMNLP 2017
Extracting Human Temporal Orientation from Facebook Language
NAACL 2015
Developing Age and Gender Predictive Lexica over Social Media
EMNLP 2014