Sebastian Gehrmann

34 papers · 2018–2026 · 6 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (6) 🏃 Academic Marathon (6) 🗺️ Taxonomy Completionist (61)

🧭 Keyword Pioneer 🐝 Cross-Pollinator (15) 🌈 Renaissance Researcher (7) 🔬 Deep Specialist (10) 👥 Mega-Team (77) 🧬 Topic Evolution ❓ The Questioner (2) 🔥 Unstoppable (7) 🗃️ Keyword Collector (148) 💎 Century Club (33) ⚡ Prolific Year (5)

Conferences

EMNLP (14) ACL (13) IJCNLP (3) NAACL (2) JMLR (1) NIPS (1)

Top co-authors

Ankur Parikh (7) Hendrik Strobelt (7) Thibault Sellam (6) Simon Mille (5) Dipanjan Das (5) João Sedoc (5) Vitaly Nikolaev (5) Yonatan Belinkov (5) Angelina McMillan-Major (5) Salomey Osei (5)

Keywords

large language model (6) language model (6) neural network (5) evaluation metric (4) text generation (4) attention mechanism (3) causal mediation analysis (3) natural language generation (3) subject-verb agreement (2) benchmark evaluation (2) table-to-text generation (2) knowledge distillation (2) machine translation (2) multilingual model (2) few-shot learning (2) pre-trained language model (2) neural network interpretability (2) summarization evaluation (2) model interpretability (2) multilingual nlp (2)

Papers

Domain Generalizable AI Guardrails with Augmented Policy Training ACL 2026
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning ACL 2024
Academics Can Contribute to Domain-Specialized Language Models EMNLP 2024
Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs EMNLP 2024
Can We Statically Locate Knowledge in Large Language Models? Financial Domain and Toxicity Reduction Case Studies EMNLP 2024
On the Role of Summary Content Units in Text Summarization Evaluation NAACL 2024
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation EMNLP 2023
Dialect-robust Evaluation of Generated Text ACL 2023
Benchmarking Large Language Model Capabilities for Conditional Generation ACL 2023
A Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization ACL 2023
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them ACL 2023
PaLM: Scaling Language Modeling with Pathways JMLR 2023
TaTA: A Multilingual Table-to-Text Dataset for African Languages EMNLP 2023
Intriguing Properties of Compression on Multilingual Models EMNLP 2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code EMNLP 2022
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics IJCNLP 2021
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models IJCNLP 2021
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards ACL 2021
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models ACL 2021
Learning Compact Metrics for MT EMNLP 2021
LMdiff: A Visual Diff Tool to Compare Language Models EMNLP 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics ACL 2021
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards IJCNLP 2021
Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task EMNLP 2020
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models ACL 2020
Interpretability and Analysis in Neural NLP ACL 2020
ToTTo: A Controlled Table-To-Text Generation Dataset EMNLP 2020
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models EMNLP 2020
Investigating Gender Bias in Language Models Using Causal Mediation Analysis NIPS 2020
Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation NAACL 2019
GLTR: Statistical Detection and Visualization of Generated Text ACL 2019
LSTM Networks Can Perform Dynamic Counting ACL 2019
Debugging Sequence-to-Sequence Models with Seq2Seq-Vis EMNLP 2018
Bottom-Up Abstractive Summarization EMNLP 2018