Samuel R. Bowman

53 papers · 2012–2025 · 10 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (16) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10) 🌟 Keyword Trendsetter Combo (3) 🏆 Grand Slam 👑 Triple Crown 🏆 Keyword Champion (2) 🤝 Dynamic Duo (12) 👥 Mega-Team (63) 🔬 Deep Specialist (12) 🧬 Topic Evolution 🚀 Conference Pioneer ⚡ Prolific Year (11) 🗃️ Keyword Collector (191) 📈 Trend Setter 💎 Century Club (53) 🔥 Unstoppable (7) ❓ The Questioner (10)

Conferences

EMNLP (15) ACL (14) NAACL (6) IJCNLP (5) ICLR (4) AACL (2) CONLL (2) ICML (2) NIPS (2) AAAI (1)

Top co-authors

Jason Phang (12) Haokun Liu (11) Clara Vania (9) Alex Warstadt (9) Phu Mon Htut (8) Alex Wang (7) Nikita Nangia (7) Katharina Kann (7) William Huang (6) Ethan Perez (6)

Keywords

natural language understanding (8) natural language inference (7) data collection (5) transfer learning (5) cross-lingual transfer (4) benchmark evaluation (4) pretrained language model (4) language model (4) dataset creation (3) transformer model (3) natural language processing (3) low-resource language (3) social bias (3) large language model (3) question answering (3) language understanding (3) unsupervised parsing (2) text classification (2) representation learning (2) semantic knowledge (2)

Papers

Language Models Learn to Mislead Humans via RLHF ICLR 2025
Debating with More Persuasive LLMs Leads to More Truthful Answers ICML 2024
Towards Understanding Sycophancy in Language Models ICLR 2024
Many-shot Jailbreaking NIPS 2024
LLM Evaluators Recognize and Favor Their Own Generations NIPS 2024
Instruction Induction: From Few Examples to Natural Language Task Descriptions ACL 2023
What Do NLP Researchers Believe? Results of the NLP Community Metasurvey ACL 2023
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning ACL 2023
Discovering Language Model Behaviors with Model-Written Evaluations ACL 2023
Pretraining Language Models with Human Preferences ICML 2023
(QA)²: Question Answering with Questionable Assumptions ACL 2023
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics EMNLP 2022
Clean or Annotate: How to Spend a Limited Data Collection Budget NAACL 2022
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair NAACL 2022
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way EMNLP 2022
NOPE: A Corpus of Naturally-Occurring Presuppositions in English CONLL 2021
When Do You Need Billions of Words of Pretraining Data? ACL 2021
Comparing Test Sets with Item Response Theory ACL 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? ACL 2021
Crowdsourcing Beyond Annotation: Case Studies in Benchmark Data Collection EMNLP 2021
Does Putting a Linguist in the Loop Improve NLU Data Collection? EMNLP 2021
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers EMNLP 2021
NOPE: A Corpus of Naturally-Occurring Presuppositions in English EMNLP 2021
When Do You Need Billions of Words of Pretraining Data? IJCNLP 2021
Comparing Test Sets with Item Response Theory IJCNLP 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? IJCNLP 2021
What Will it Take to Fix Benchmarking in Natural Language Understanding? NAACL 2021
New Protocols and Negative Results for Textual Entailment Data Collection EMNLP 2020
Precise Task Formalization Matters in Winograd Schema Evaluations EMNLP 2020
Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data EMNLP 2020
Learning to Learn Morphological Inflection for Resource-Poor Languages AAAI 2020
English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too AACL 2020
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models ACL 2020
Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? ACL 2020
Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options AACL 2020
Self-Training for Unsupervised Parsing with PRPN ACL 2020
Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually) EMNLP 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models EMNLP 2020
Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark ACL 2019
Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling ACL 2019
On Measuring Social Biases in Sentence Encoders NAACL 2019
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs EMNLP 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding ICLR 2019
What do you learn from context? Probing for sentence structure in contextualized word representations ICLR 2019
Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set EMNLP 2019
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs IJCNLP 2019
Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set IJCNLP 2019
Identifying and Reducing Gender Bias in Word-Level Language Models NAACL 2019
Neural Unsupervised Parsing Beyond English EMNLP 2019
A Fast Unified Model for Parsing and Sentence Understanding ACL 2016
Generating Sentences from a Continuous Space CONLL 2016
A large annotated corpus for learning natural language inference EMNLP 2015
Automatic Animacy Classification NAACL 2012