Dan Klein

216 papers · 2001–2025 · 12 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (28) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (9) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (14) 🏠 Conference Loyalist (50) 🧬 Topic Evolution 🤝 Dynamic Duo (20) 👑 Triple Crown 🔬 Deep Specialist (11) 🏆 Keyword Champion ⚡ Prolific Year (6) ❓ The Questioner (11) 🗃️ Keyword Collector (72) 📈 Trend Setter 💎 Century Club (216) 🔥 Unstoppable (25)

Conferences

ACL (74) EMNLP (50) NAACL (41) CONLL (15) NIPS (11) IJCNLP (9) ICML (6) COLING (3) ICLR (3) CVPR (2) ICCV (1) IJCAI (1)

Top co-authors

Adam Pauls (20) Jacob Andreas (20) Taylor Berg-Kirkpatrick (19) Percy Liang (18) John DeNero (14) Slav Petrov (13) Aria Haghighi (12) Trevor Darrell (12) Kevin Yang (12) Daniel Fried (12)

Research topics

Linguistics (1)

Keywords

large language model (11) language model (10) constituency parsing (10) neural network (9) syntactic parsing (5) instruction following (5) multimodal learning (5) unsupervised learning (5) probabilistic model (4) semantic parsing (4) natural language understanding (4) prompt engineering (3) speech synthesis (3) text generation (3) beam search (3) text classification (3) language grounding (3) reward function (3) distribution shift (3) code generation (3)

Papers

Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions EMNLP 2025
FactTrack: Time-Aware World State Tracking in Story Outlines NAACL 2025
ThoughtSculpt: Reasoning with Intermediate Revision and Search NAACL 2025
Enough Coin Flips Can Make LLMs Act Bayesian ACL 2025
Pose Priors from Language Models CVPR 2025
LangProBe: a Language Program Benchmark EMNLP 2025
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels NAACL 2024
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding NAACL 2024
What Evidence Do Language Models Find Convincing? ACL 2024
Explaining Datasets in Words: Statistical Models with Natural Language Parameters NIPS 2024
Re-evaluating the Need for Visual Signals in Unsupervised Grammar Induction NAACL 2024
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination EMNLP 2024
RLCD: Reinforcement Learning from Contrastive Distillation for LM Alignment ICLR 2024
Learning to Model the World With Language ICML 2024
Inferring Ontological Categories of OWL Classes Using Foundational Rules (Extended Abstract) IJCAI 2024
American Sign Language Handshapes Reflect Pressures for Communicative Efficiency ACL 2024
Ghostbuster: Detecting Text Ghostwritten by Large Language Models NAACL 2024
DOC: Improving Long Story Coherence With Detailed Outline Control ACL 2023
The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding ACL 2023
PREADD: Prefix-Adaptive Decoding for Controlled Text Generation ACL 2023
Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents ACL 2023
Goal Driven Discovery of Distributional Differences via Language Descriptions NIPS 2023
Poisoning Language Models During Instruction Tuning ICML 2023
Discovering Latent Knowledge in Language Models Without Supervision ICLR 2023
Can Language Models Learn to Listen? ICCV 2023
Revisiting Entropy Rate Constancy in Text EMNLP 2023
Improving Pacing in Long-Form Story Planning EMNLP 2023
Decomposing Complex Queries for Tip-of-the-tongue Retrieval EMNLP 2023
Centering the Margins: Outlier-Based Identification of Harmed Populations in Toxicity Detection EMNLP 2023
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks EMNLP 2023
Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL EMNLP 2023
Incorporating Worker Perspectives into MTurk Annotation Practices for NLP EMNLP 2023
Neural Unsupervised Reconstruction of Protolanguage Word Forms ACL 2023
Modular Visual Question Answering via Code Generation ACL 2023
Inferring Rewards from Language in Context ACL 2022
Voxel-informed Language Grounding ACL 2022
Understanding Game-Playing Agents with Natural Language Annotations ACL 2022
Describing Differences between Text Distributions with Natural Language ICML 2022
Re3: Generating Longer Stories With Recursive Reprompting and Revision EMNLP 2022
Automated Crossword Solving ACL 2022
Learned Incremental Representations for Parsing ACL 2022
Value-Agnostic Conversational Semantic Parsing ACL 2021
An Improved Model for Voicing Silent Speech ACL 2021
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level ACL 2021
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections EMNLP 2021
Constrained Language Models Yield Few-Shot Semantic Parsers EMNLP 2021
Reference-Centric Models for Grounded Collaborative Dialogue EMNLP 2021
Modular Networks for Compositional Instruction Following NAACL 2021
Detoxifying Language Models Risks Marginalizing Minority Voices NAACL 2021
FUDGE: Controlled Text Generation With Future Discriminators NAACL 2021
Constructing Taxonomies from Pretrained Language Models NAACL 2021
Interactive Assignments for Teaching Structured Neural NLP NAACL 2021
Learning Space Partitions for Path Planning NIPS 2021
Value-Agnostic Conversational Semantic Parsing IJCNLP 2021
An Improved Model for Voicing Silent Speech IJCNLP 2021
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level IJCNLP 2021
Calibrate Before Use: Improving Few-shot Performance of Language Models ICML 2021
Semantic Evaluation for Text-to-SQL with Distilled Test Suites EMNLP 2020
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference ACL 2020
Multilingual Alignment of Contextual Word Representations ICLR 2020
Semantic Scaffolds for Pseudocode-to-Code Generation ACL 2020
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers ICML 2020
Digital Voicing of Silent Speech EMNLP 2020
Unsupervised Parsing via Constituency Tests EMNLP 2020
A Streaming Approach For Efficient Batched Beam Search EMNLP 2020
A Deep Factorization of Style and Structure in Fonts EMNLP 2019
Pragmatically Informative Text Generation NAACL 2019
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation ACL 2019
A Deep Factorization of Style and Structure in Fonts IJCNLP 2019
Cross-Domain Generalization of Neural Constituency Parsers ACL 2019
Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following ACL 2019
Multilingual Constituency Parsing with Self-Attention and Pre-Training ACL 2019
Learning with Latent Language NAACL 2018
Speaker-Follower Models for Vision-and-Language Navigation NIPS 2018
Constituency Parsing with a Self-Attentive Encoder ACL 2018
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing ACL 2018
What’s Going On in Neural Constituency Parsers? An Analysis NAACL 2018
Unified Pragmatic Models for Generating and Following Instructions NAACL 2018
Analogs of Linguistic Structure in Deep Representations EMNLP 2017
Translating Neuralese ACL 2017
A Minimal Span-Based Neural Constituency Parser ACL 2017
Abstract Syntax Networks for Code Generation and Semantic Parsing ACL 2017
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects ACL 2017
Fine-Grained Entity Typing with High-Multiplicity Assignments ACL 2017
Where is Misty? Interpreting Spatial Descriptors by Modeling Regions in Space EMNLP 2017
Effective Inference for Generative Neural Parsing EMNLP 2017
Modular Multitask Reinforcement Learning with Policy Sketches ICML 2017
Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints ACL 2016
Reasoning about Pragmatics with Neural Listeners and Speakers EMNLP 2016
Learning to Compose Neural Networks for Question Answering NAACL 2016
Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks NAACL 2016
Neural Module Networks CVPR 2016
Unsupervised Code-Switching for Multilingual Historical Document Transcription NAACL 2015
An Empirical Analysis of Optimization for Max-Margin NLP EMNLP 2015
On the Accuracy of Self-Normalized Log-Linear Models NIPS 2015
Neural CRF Parsing IJCNLP 2015
Neural CRF Parsing ACL 2015
Disfluency Detection with a Semi-Markov Model and Prosodic Features NAACL 2015
When and why are log-linear models self-normalizing? NAACL 2015
GPU-Friendly Local Regression for Voice Conversion NAACL 2015
Alignment-Based Compositional Semantics for Instruction Following EMNLP 2015
Sparser, Better, Faster GPU Parsing ACL 2014
Less Grammar, More Features ACL 2014
Structured Learning for Taxonomy Induction with Belief Propagation ACL 2014
Improved Typesetting Models for Historical OCR ACL 2014
How much do word embeddings encode about syntax? ACL 2014
Unsupervised Transcription of Piano Music NIPS 2014
Grounding Language with Points and Paths in Continuous Spaces CONLL 2014
Easy Victories and Uphill Battles in Coreference Resolution EMNLP 2013
Variational Inference for Structured NLP Models ACL 2013
Decentralized Entity-Level Modeling for Coreference Resolution ACL 2013
Unsupervised Transcription of Historical Documents ACL 2013
Error-Driven Analysis of Challenges in Coreference Resolution EMNLP 2013
Decipherment with a Million Random Restarts EMNLP 2013
A Multi-Teraflop Constituency Parser using GPUs EMNLP 2013
An Empirical Examination of Challenges in Chinese Parsing ACL 2013
Large-Scale Syntactic Language Modeling with Treelets ACL 2012
Syntactic Transfer Using a Bilingual Lexicon CONLL 2012
Transforming Trees to Improve Syntactic Convergence CONLL 2012
An Empirical Investigation of Statistical Significance in NLP CONLL 2012
Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output CONLL 2012
Training Factored PCFGs with Expectation Propagation CONLL 2012
Variational Inference for Structured NLP Models NAACL 2012
Fast Inference in Phrase Extraction Models with Belief Propagation NAACL 2012
Syntactic Transfer Using a Bilingual Lexicon EMNLP 2012
Transforming Trees to Improve Syntactic Convergence EMNLP 2012
An Empirical Investigation of Statistical Significance in NLP EMNLP 2012
Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output EMNLP 2012
Training Factored PCFGs with Expectation Propagation EMNLP 2012
Robust Conversion of CCG Derivations to Phrase Structure Trees ACL 2012
Coreference Semantics from Web Features ACL 2012
Mention Detection: Heuristics for the OntoNotes annotations CONLL 2011
Faster and Smaller N-Gram Language Models ACL 2011
The Surprising Variance in Shortest-Derivation Parsing ACL 2011
An Empirical Investigation of Discounting in Cross-Domain Language Models ACL 2011
Web-Scale Features for Full-Scale Parsing ACL 2011
Learning Dependency-Based Compositional Semantics ACL 2011
Jointly Learning to Extract and Compress ACL 2011
Large-Scale Cognate Recovery EMNLP 2011
Simple Effective Decipherment via Combinatorial Optimization EMNLP 2011
Painless Unsupervised Learning with Features NAACL 2010
Learning Better Monolingual Models with Unannotated Bilingual Text CONLL 2010
A Simple Domain-Independent Probabilistic Approach to Generation EMNLP 2010
Top-Down K-Best A* Parsing ACL 2010
An Entity-Level Approach to Information Extraction ACL 2010
Hierarchical A* Parsing with Bridge Outside Scores ACL 2010
Discriminative Modeling of Extraction Sets for Machine Translation ACL 2010
A Game-Theoretic Approach to Generating Spatial Descriptions EMNLP 2010
Finding Cognate Groups Using Phylogenies ACL 2010
Simple, Accurate Parsing with an All-Fragments Grammar ACL 2010
Phylogenetic Grammar Induction ACL 2010
Unsupervised Syntactic Alignment with Inversion Transduction Grammars NAACL 2010
Joint Parsing and Alignment with Weakly Synchronized Grammars NAACL 2010
Coreference Resolution in a Modular, Entity-Centered Model NAACL 2010
Type-Based MCMC NAACL 2010
Better Word Alignments with Supervised ITG Models IJCNLP 2009
Simple Coreference Resolution with Rich Syntactic and Semantic Features EMNLP 2009
Consensus Training for Consensus Decoding in Machine Translation EMNLP 2009
Asynchronous Binarization for Synchronous Grammars ACL 2009
K-Best A* Parsing ACL 2009
Better Word Alignments with Supervised ITG Models ACL 2009
Learning Semantic Correspondences with Less Supervision ACL 2009
Learning Semantic Correspondences with Less Supervision IJCNLP 2009
K-Best A* Parsing IJCNLP 2009
Asynchronous Binarization for Synchronous Grammars IJCNLP 2009
Improved Reconstruction of Protolanguage Word Forms NAACL 2009
Efficient Parsing for Transducer Grammars NAACL 2009
Hierarchical Search for Parsing NAACL 2009
Online EM for Unsupervised Models NAACL 2009
Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs NIPS 2009
Learning Bilingual Lexicons from Monolingual Corpora ACL 2008
Efficient Inference in Phylogenetic InDel Trees NIPS 2008
The Complexity of Phrase Alignment Problems ACL 2008
Two Languages are Better than One (for Syntactic Parsing) EMNLP 2008
Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing EMNLP 2008
Sampling Alignment Structure under a Bayesian Translation Model EMNLP 2008
Coarse-to-Fine Syntactic Machine Translation using Language Projections EMNLP 2008
Analyzing the Errors of Unsupervised Learning ACL 2008
Approximate Factoring for A* Search NAACL 2007
Introduction to Classification: Likelihoods, Margins, Features, and Kernels NAACL 2007
Discriminative Log-Linear Grammars with Latent Variables NIPS 2007
Unsupervised Coreference Resolution in a Nonparametric Bayesian Model ACL 2007
A Probabilistic Approach to Language Change NIPS 2007
A Probabilistic Approach to Diachronic Phonology EMNLP 2007
Learning Structured Models for Phone Recognition EMNLP 2007
The Infinite PCFG Using Hierarchical Dirichlet Processes EMNLP 2007
Tailoring Word Alignments to Syntactic Machine Translation ACL 2007
The Infinite PCFG Using Hierarchical Dirichlet Processes CONLL 2007
A Probabilistic Approach to Diachronic Phonology CONLL 2007
Learning Structured Models for Phone Recognition CONLL 2007
Agreement-Based Learning NIPS 2007
Improved Inference for Unlexicalized Parsing NAACL 2007
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X) CONLL 2006
Alignment by Agreement NAACL 2006
Word Alignment via Quadratic Assignment NAACL 2006
Prototype-Driven Learning for Sequence Models NAACL 2006
An End-to-End Discriminative Approach to Machine Translation COLING 2006
Learning Accurate, Compact, and Interpretable Tree Annotation COLING 2006
Non-Local Modeling with a Mixture of PCFGs CONLL 2006
Prototype-Driven Grammar Induction ACL 2006
An End-to-End Discriminative Approach to Machine Translation ACL 2006
Learning Accurate, Compact, and Interpretable Tree Annotation ACL 2006
Prototype-Driven Grammar Induction COLING 2006
A Discriminative Matching Approach to Word Alignment EMNLP 2005
Unsupervised Learning of Field Segmentation Models for Information Extraction ACL 2005
Max-Margin Parsing EMNLP 2004
Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency ACL 2004
A* Parsing: Fast Exact Viterbi Parse Selection NAACL 2003
Named Entity Recognition with Character-Level Models CONLL 2003
Optimization, Maxent Models, and Conditional Estimation without Magic NAACL 2003
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network NAACL 2003
Accurate Unlexicalized Parsing ACL 2003
A Generative Constituent-Context Model for Improved Grammar Induction ACL 2002
Conditional Structure versus Conditional Estimation in NLP Models EMNLP 2002
Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank ACL 2001
Distributional phrase structure induction CONLL 2001