Dan Jurafsky

132 papers · 2004–2026 · 13 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (14) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (13)

🏠 Conference Loyalist (22) 🌟 Keyword Trendsetter Combo (3) 🏆 Grand Slam 👑 Triple Crown 🏆 Keyword Champion (2) 🌱 Topic Pioneer 🔬 Deep Specialist (13) 🧬 Topic Evolution 🚀 Conference Pioneer ⚡ Prolific Year (7) 🗃️ Keyword Collector (379) 📈 Trend Setter ❓ The Questioner (11) 🔥 Unstoppable (11) 💎 Century Club (129)

Conferences

ACL (37) EMNLP (28) NAACL (22) EACL (9) IJCNLP (7) ICLR (6) NIPS (6) INTERSPEECH (5) ICML (4) CONLL (3) COLING (2) JMLR (2) AAAI (1)

Top co-authors

Jiwei Li (8) Christopher D. Manning (8) Mirac Suzgun (7) Kawin Ethayarajh (7) Christopher Potts (7) Nathanael Chambers (7) Tatsunori Hashimoto (6) Kristina Gligoric (5) Kaitlyn Zhou (5) Federico Bianchi (5)

Research topics

Applications (1)

Keywords

language model (8) automatic speech recognition (6) large language model (6) text classification (5) low-resource language (5) representation learning (5) text generation (5) natural language generation (4) transfer learning (4) reinforcement learning (3) social media analysis (3) word embedding (3) discourse coherence (3) sentiment analysis (3) domain adaptation (3) algorithmic fairness (2) information extraction (2) cross-lingual transfer (2) model evaluation (2) attention mechanism (2)

Papers

Accommodation and Epistemic Vigilance: A Pragmatic Account of Why LLMs Fail to Challenge Harmful Beliefs ACL 2026
Beyond Tokens: Concept-Level Training Objectives for LLMs EACL 2026
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory EACL 2026
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models EMNLP 2025
HumT DumT: Measuring and controlling human-like language in LLMs ACL 2025
AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders ICML 2025
Can Unconfident LLM Annotations Be Used for Confident Conclusions? NAACL 2025
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties EMNLP 2025
REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance NAACL 2025
Rethinking Word Similarity: Semantic Similarity through Classification Confusion NAACL 2025
h4rm3l: A Language for Composable Jailbreak Attack Synthesis ICLR 2025
What can large language models do for sustainable food? ICML 2025
AnthroScore: A Computational Linguistic Measure of Anthropomorphism EACL 2024
ReFT: Representation Finetuning for Language Models NIPS 2024
Grounding Gaps in Language Model Generations NAACL 2024
NLP Systems That Can’t Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps NAACL 2024
A layer-wise analysis of Mandarin and English suprasegmentals in SSL speech models INTERSPEECH 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets INTERSPEECH 2024
Model Alignment as Prospect Theoretic Optimization ICML 2024
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis ICML 2024
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions ICLR 2024
A Benchmark for Learning to Translate a New Language from One Grammar Book ICLR 2024
CausalGym: Benchmarking causal interpretability methods on linguistic tasks ACL 2024
string2string: A Modern Python Library for String-to-String Algorithms ACL 2024
SumTablets: A Transliteration Dataset of Sumerian Tablets ACL 2024
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens EACL 2024
When and Why Vision-Language Models Behave like Bags-Of-Words, and What to Do About It? ICLR 2023
Multilingual BERT has an Accent: Evaluating English Influences on Fluency in Multilingual Models EACL 2023
Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching EMNLP 2023
Injecting structural hints: Using language models to study inductive biases in language learning EMNLP 2023
Mini But Mighty: Efficient Multilingual Pretraining with Linguistically-Informed Data Selection EACL 2023
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation ACL 2023
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models ACL 2023
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding ACL 2023
When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization EACL 2023
Developing Speech Processing Pipelines for Police Accountability INTERSPEECH 2023
Foundation Models and Fair Use JMLR 2023
Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models EACL 2023
Navigating the Grey Area: How Expressions of Uncertainty and Overconfidence Affect Language Models EMNLP 2023
Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes NIPS 2023
Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? NIPS 2022
Problems with Cosine as a Measure of Embedding Similarity for High Frequency Words ACL 2022
Richer Countries and Richer Representations ACL 2022
Modular Domain Adaptation ACL 2022
Automated speech tools for helping communities process restricted-access corpora for language revival efforts ACL 2022
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset NIPS 2022
The Authenticity Gap in Human Evaluation EMNLP 2022
Computationally Identifying Funneling and Focusing Questions in Classroom Discourse NAACL 2022
Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models EMNLP 2022
Attention Flows are Shapley Value Explanations IJCNLP 2021
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions ACL 2021
Attention Flows are Shapley Value Explanations ACL 2021
The Emergence of the Shape Bias Results from Communicative Efficiency CONLL 2021
Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference EMNLP 2021
The Emergence of the Shape Bias Results from Communicative Efficiency EMNLP 2021
Nearest Neighbor Machine Translation ICLR 2021
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions IJCNLP 2021
Causal Effects of Linguistic Properties NAACL 2021
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation NAACL 2021
Generalization through Memorization: Nearest Neighbor Language Models ICLR 2020
Utility is in the Eye of the User: A Critique of NLP Leaderboards EMNLP 2020
Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models EMNLP 2020
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning JMLR 2020
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models ACL 2020
Social Bias Frames: Reasoning about Social and Power Implications of Language ACL 2020
Automatically Neutralizing Subjective Bias in Text AAAI 2020
Language Through a Prism: A Spectral Approach for Multiscale Language Representations NIPS 2020
Detecting Stance in Media On Global Warming EMNLP 2020
With Little Power Comes Great Responsibility EMNLP 2020
Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts IJCNLP 2019
Neural Text Style Transfer via Denoising and Reranking NAACL 2019
Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts EMNLP 2019
Recursive Routing Networks: Learning to Compose Modules for Language Understanding NAACL 2019
Let’s Make Your Request More Persuasive: Modeling Persuasive Strategies via Semi-Supervised Neural Nets on Crowdfunding Platforms NAACL 2019
From Insanely Jealous to Insanely Delicious: Computational Models for the Semantic Bleaching of English Intensifiers ACL 2019
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings NAACL 2019
Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies EMNLP 2018
Textual Analogy Parsing: What’s Shared and What’s Compared among Analogous Facts EMNLP 2018
Deconfounded Lexicon Induction for Interpretable Social Science NAACL 2018
Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction NAACL 2018
Automatic Detection of Incoherent Speech for Diagnosing Schizophrenia NAACL 2018
Embedding Logical Queries on Knowledge Graphs NIPS 2018
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context ACL 2018
Adversarial Learning for Neural Dialogue Generation EMNLP 2017
Neural Net Models of Open-domain Discourse Coherence EMNLP 2017
Incorporating Dialectal Variability for Socially Equitable Language Identification ACL 2017
A Two-stage Sieve Approach for Quote Attribution EACL 2017
Deep Reinforcement Learning for Dialogue Generation EMNLP 2016
Between- and Within-Speaker Effects of Bilingualism on F0 Variation INTERSPEECH 2016
Ketchup, Interdisciplinarity, and the Spread of Innovation in Speech and Language Processing INTERSPEECH 2016
Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change ACL 2016
Predicting the Rise and Fall of Scientific Topics from Trends in their Rhetorical Framing ACL 2016
Distinguishing Past, On-going, and Future Events: The EventStatus Corpus EMNLP 2016
Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora EMNLP 2016
Visualizing and Understanding Neural Models in NLP NAACL 2016
Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change EMNLP 2016
A Hierarchical Neural Autoencoder for Paragraphs and Documents IJCNLP 2015
When Are Tree Structures Necessary for Deep Learning of Representations? EMNLP 2015
The Users Who Say ‘Ni’: Audience Identification in Chinese-language Restaurant Reviews ACL 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents ACL 2015
Do Multi-Sense Embeddings Improve Natural Language Understanding? EMNLP 2015
The Users Who Say ‘Ni’: Audience Identification in Chinese-language Restaurant Reviews IJCNLP 2015
Lexicon-Free Conversational Speech Recognition with Neural Networks NAACL 2015
A computational approach to politeness with application to social factors ACL 2013
Linguistic Models for Analyzing and Detecting Biased Language ACL 2013
Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs ACL 2013
Generating Recommendation Dialogs by Extracting Information from User Reviews ACL 2013
He Said, She Said: Gender in the ACL Anthology ACL 2012
Joint Entity and Event Coreference Resolution across Documents EMNLP 2012
Joint Entity and Event Coreference Resolution across Documents CONLL 2012
Towards a Computational History of the ACL: 1980-2008 ACL 2012
A Computational Analysis of Style, Affect, and Imagery in Contemporary Poetry NAACL 2012
Towards a Literary Machine Translation: The Role of Referential Cohesion NAACL 2012
Template-Based Information Extraction without the Templates ACL 2011
Stanford’s Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task CONLL 2011
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010) COLING 2010
A Multi-Pass Sieve for Coreference Resolution EMNLP 2010
Coling 2010: Posters COLING 2010
Robust Machine Translation Evaluation with Entailment Features ACL 2009
It’s Not You, it’s Me: Detecting Flirting and its Misperception in Speed-Dates EMNLP 2009
Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation NAACL 2009
Robust Machine Translation Evaluation with Entailment Features IJCNLP 2009
Unsupervised Learning of Narrative Schemas and their Participants IJCNLP 2009
Unsupervised Learning of Narrative Schemas and their Participants ACL 2009
Unsupervised Learning of Narrative Event Chains ACL 2008
Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates ACL 2008
Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization ACL 2007
To Memorize or to Predict: Prominence labeling in Conversational Speech NAACL 2007
Disambiguating Between Generic and Referential “You” in Dialog ACL 2007
Classifying Temporal Relations Between Events ACL 2007
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing EMNLP 2006
Shallow Semantic Parsing using Support Vector Machines NAACL 2004