David Mimno
33 papers · 2009–2026 · 9 conferences · across top CS/AI conferences
Achievements
🐝 Cross-Pollinator (10) 🧭 Keyword Pioneer 🏃 Academic Marathon (16) 🌍 Conference Polyglot (9) 🌈 Renaissance Researcher (8)
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🐝 Cross-Pollinator (10)
🧬 Topic Evolution
🏆 Keyword Champion (3)
❓ The Questioner
🚀 Conference Pioneer
💎 Century Club (31)
🔥 Unstoppable (9)
⚡ Prolific Year (5)
📈 Trend Setter
🗃️ Keyword Collector (129)
Conferences
EMNLP (15)
EACL (5)
IJCNLP (3)
NAACL (3)
ICML (2)
NIPS (2)
ACL (1)
AISTATS (1)
COLING (1)
Research topics
Keywords
topic model (4)
language model (4)
topic modeling (4)
unsupervised learning (4)
social bias (3)
topic inference (3)
multimodal learning (3)
pretraining data (2)
large language model (2)
spectral topic model (2)
literary text (2)
digital humanities (2)
seed lexicon (2)
anchor word (2)
probabilistic modeling (2)
bias measurement (2)
matrix factorization (2)
latent topic analysis (2)
spectral algorithm (2)
image-sentence matching (2)
Papers
Show or Tell? Modeling the evolution of request-making in Human-LLM conversations
EACL 2026
Too Long, Didn’t Model: Decomposing LLM Long Context Understanding With Novels
EACL 2026
A City of Millions: Mapping Literary Social Networks At Scale
NAACL 2025
A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
NAACL 2024
Contextualized Topic Coherence Metrics
EACL 2024
[Lions: 1] and [Tigers: 2] and [Bears: 3], Oh My! Literary Coreference Annotation with LLMs
EACL 2024
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings
EMNLP 2023
Data Similarity is Not Enough to Explain Language Model Performance
EMNLP 2023
Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement
EMNLP 2023
On-the-fly Rectification for Robust Large-Vocabulary Topic Inference
ICML 2021
Bad Seeds: Evaluating Lexical Methods for Bias Measurement
ACL 2021
Comparing Text Representations: A Theory-Driven Approach
EMNLP 2021
Bad Seeds: Evaluating Lexical Methods for Bias Measurement
IJCNLP 2021
‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’
EMNLP 2021
Prior-aware Composition Inference for Spectral Topic Models
AISTATS 2020
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
EMNLP 2020
Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm
EMNLP 2019
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
EMNLP 2019
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
IJCNLP 2019
Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm
IJCNLP 2019
Authorless Topic Models: Biasing Models Away from Known Structure
COLING 2018
Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets
NAACL 2018
Quantifying the Effects of Text Duplication on Semantic Models
EMNLP 2017
Pulling Out the Stops: Rethinking Stopword Removal for Topic Models
EACL 2017
The strange geometry of skip-gram with negative sampling
EMNLP 2017
Beyond Exchangeability: The Chinese Voting Process
NIPS 2016
Robust Spectral Inference for Joint Stochastic Matrix Factorization
NIPS 2015
Evaluation methods for unsupervised word embeddings
EMNLP 2015
Low-dimensional Embeddings for Interpretable Anchor-based Topic Inference
EMNLP 2014
A Practical Algorithm for Topic Modeling with Provable Guarantees
ICML 2013
Optimizing Semantic Coherence in Topic Models
EMNLP 2011
Bayesian Checking for Topic Models
EMNLP 2011
Polylingual Topic Models
EMNLP 2009