Esin Durmus

28 papers · 2018–2025 · 8 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (6) 🤝 Dynamic Duo (13) 👥 Mega-Team (77) 🧬 Topic Evolution 💎 Century Club (28) ❓ The Questioner (3) ⚡ Prolific Year (5) 🗃️ Keyword Collector (104) 🔥 Unstoppable (8)

Conferences

ACL (11) EMNLP (5) IJCNLP (3) NAACL (3) ICLR (2) ICML (2) EACL (1) NIPS (1)

Top co-authors

Faisal Ladhak (13) Claire Cardie (12) Tatsunori Hashimoto (7) Tosin Adewumi (3) He He (3) Aman Madaan (3) Khyathi Raghavi Chandu (3) Vitaly Nikolaev (3) Kaustubh Dhole (3) Simon Mille (3)

Keywords

evaluation metric (5) natural language generation (5) text classification (4) text summarization (4) abstractive summarization (4) natural language processing (3) sentiment analysis (3) hallucination detection (3) large language model (3) argument mining (3) opinion mining (2) reference-free evaluation (2) linguistic feature (2) faithfulness evaluation (2) stance detection (2) pre-trained language model (2) text generation (2) language model (2) persuasion detection (2) semantic similarity (2)

Papers

SafeArena: Evaluating the Safety of Autonomous Web Agents ICML 2025
Many-shot Jailbreaking NIPS 2024
Towards Understanding Sycophancy in Language Models ICLR 2024
NLP Systems That Can’t Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps NAACL 2024
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models ACL 2023
Contrastive Error Attribution for Finetuned Language Models ACL 2023
Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture ACL 2023
When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization EACL 2023
Whose Opinions Do Language Models Reflect? ICML 2023
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code EMNLP 2022
Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization ACL 2022
Spurious Correlations in Reference-Free Evaluation of Text Generation ACL 2022
Improving Faithfulness by Augmenting Negative Summaries from Fake Documents EMNLP 2022
Language modeling via stochastic processes ICLR 2022
Leveraging Topic Relatedness for Argument Persuasion ACL 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics ACL 2021
Leveraging Topic Relatedness for Argument Persuasion IJCNLP 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics IJCNLP 2021
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization ACL 2020
Exploring the Role of Argument Structure in Online Debate Persuasion EMNLP 2020
WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization EMNLP 2020
The Role of Pragmatic and Discourse Context in Determining Argument Impact EMNLP 2019
The Role of Pragmatic and Discourse Context in Determining Argument Impact IJCNLP 2019
Determining Relative Argument Specificity and Stance for Complex Argumentative Structures ACL 2019
Persuasion of the Undecided: Language vs. the Listener ACL 2019
A Corpus for Modeling User and Language Effects in Argumentation on Online Debating ACL 2019
Understanding the Effect of Gender and Stance in Opinion Expression in Debates on “Abortion” NAACL 2018
Exploring the Role of Prior Beliefs for Argument Persuasion NAACL 2018