Ani Nenkova

85 papers · 2003–2026 · 12 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (22) 🌍 Conference Polyglot (12) 🌈 Renaissance Researcher (9) 🗺️ Taxonomy Completionist (76)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (21) 🤝 Dynamic Duo (11) 🧬 Topic Evolution 🏆 Keyword Champion 🌱 Topic Pioneer 🗃️ Keyword Collector (182) 🔥 Unstoppable (19) 📈 Trend Setter ❓ The Questioner (5) ⚡ Prolific Year (6) 💎 Century Club (83)

Conferences

NAACL (21) EMNLP (19) ACL (17) EACL (8) IJCNLP (8) COLING (3) AACL (2) CONLL (2) ICLR (2) INTERSPEECH (1) NIPS (1) WACV (1)

Top co-authors

Jiuxiang Gu (11) Annie Louis (11) Oshin Agarwal (8) Byron Wallace (8) Ruiyi Zhang (7) Tong Sun (7) Emily Pitler (7) Junyi Jessy Li (7) Alexa Siu (6) Vlad Morariu (6)

Keywords

named entity recognition (6) text summarization (6) information extraction (5) distributed representation (3) summarization evaluation (3) question answering (3) document understanding (3) multimodal learning (3) text embedding (2) domain adaptation (2) language model (2) sequence labeling (2) evaluation metric (2) natural language processing (2) multi-modal learning (2) sequence tagging (2) model evaluation (2) text classification (2) hierarchical structure (2) compression algorithm (2)

Papers

LinkNav: Surfacing Interconnected Information in Scientific Articles ACL 2026
Standardizing the Measurement of Text Diversity: A Tool and Comparative Analysis AACL 2025
Standardizing the Measurement of Text Diversity: A Tool and Comparative Analysis IJCNLP 2025
SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps EMNLP 2025
LLM-as-a-Judge Failures at Automating the Identification of Poor Quality Outputs in Free-Form Texts EMNLP 2025
PDFTriage: Question Answering over Long, Structured Documents EMNLP 2024
Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning EACL 2024
How Much Annotation is Needed to Compare Summarization Models? NAACL 2024
Self-Cleaning: Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances NAACL 2024
ATLAS: A System for PDF-centric Human Interaction Data Collection NAACL 2024
ADOPD: A Large-Scale Document Page Decomposition Dataset ICLR 2024
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation ICLR 2024
A Critical Analysis of Document Out-of-Distribution Detection EMNLP 2023
Learning the Visualness of Text Using Large Vision-Language Models EMNLP 2023
Factual or Contextual? Disentangling Error Types in Entity Description Generation ACL 2023
LayerDoc: Layer-Wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents WACV 2023
Named Entity Recognition in a Very Homogenous Domain EACL 2023
Multi-domain Summarization from Leaderboards to Practice: Re-examining Automatic and Human Evaluation EMNLP 2023
“Am I Answering My Job Interview Questions Right?”: A NLP Approach to Predict Degree of Explanation in Job Interview Responses EMNLP 2022
DocTime: A Document-level Temporal Dependency Graph Parser NAACL 2022
Self-Repetition in Abstractive Neural Summarizers AACL 2022
Learning Adaptive Axis Attentions in Fine-tuning: Beyond Fixed Sparse Attention Patterns ACL 2022
DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis INTERSPEECH 2022
Self-Repetition in Abstractive Neural Summarizers IJCNLP 2022
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding EMNLP 2022
Influence Functions for Sequence Tagging Models EMNLP 2022
Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling EMNLP 2022
The Utility and Interplay of Gazetteers and Entity Segmentation for Named Entity Recognition in English ACL 2021
UniDoc: Unified Pretraining Framework for Document Understanding NIPS 2021
The Utility and Interplay of Gazetteers and Entity Segmentation for Named Entity Recognition in English IJCNLP 2021
From Toxicity in Online Comments to Incivility in American News: Proceed with Caution EACL 2021
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time ACL 2020
The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization IJCNLP 2019
The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization EMNLP 2019
Evaluation of named entity coreference NAACL 2019
Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence NAACL 2019
Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction NAACL 2019
How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature NAACL 2019
Emotion Impacts Speech Recognition Performance NAACL 2019
Evaluating Multiple System Summary Lengths: A Case Study EMNLP 2018
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature ACL 2018
Syntactic Patterns Improve Information Extraction for Medical Search NAACL 2018
Detecting (Un)Important Content for Single-Document News Summarization EACL 2017
Aggregating and Predicting Sequence Labels from Crowd Annotations ACL 2017
The Instantiation Discourse Relation: A Corpus Analysis of Its Properties and Improved Detection NAACL 2016
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies NAACL 2016
Detecting Content-Heavy Sentences: A Cross-Language Case Study EMNLP 2015
Inducing Lexical Style Properties for Paraphrase and Genre Differentiation NAACL 2015
Identification and Characterization of Newsworthy Verbs in World News NAACL 2015
System Combination for Multi-document Summarization EMNLP 2015
Cross-lingual Discourse Relation Analysis: A corpus study and a semi-supervised classification system COLING 2014
Assessing the Discourse Factors that Influence the Quality of Machine Translation ACL 2014
Verbose, Laconic or Just Right: A Simple Computational Model of Content Appropriateness under Length Constraints EACL 2014
Improving the Estimation of Word Importance for News Multi-Document Summarization EACL 2014
A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art ACL 2013
Lexical Differences in Autobiographical Narratives from Schizophrenic Patients and Healthy Controls CONLL 2012
Acoustic-Prosodic Entrainment and Social Behavior NAACL 2012
A Coherence Model Based on Syntactic Patterns EMNLP 2012
Lexical Differences in Autobiographical Narratives from Schizophrenic Patients and Healthy Controls EMNLP 2012
Proceedings of the NAACL HLT 2012 Student Research Workshop NAACL 2012
A Coherence Model Based on Syntactic Patterns CONLL 2012
Automatic Summarization ACL 2011
Automatic identification of general and specific sentences by leveraging discourse annotations IJCNLP 2011
Automatic Evaluation of Linguistic Quality in Multi-Document Summarization ACL 2010
Creating Local Coherence: An Empirical Assessment NAACL 2010
Automatically Evaluating Content Selection in Summarization without Human Models EMNLP 2009
Automatic sense prediction for implicit discourse relations in text ACL 2009
Using Syntax to Disambiguate Explicit Discourse Connectives in Text ACL 2009
Performance Confidence Estimation for Automatic Summarization EACL 2009
Using Syntax to Disambiguate Explicit Discourse Connectives in Text IJCNLP 2009
Predicting the Fluency of Text with Shallow Structural Features: Case Studies of Machine Translation and Human-Written Text EACL 2009
Automatic sense prediction for implicit discourse relations in text IJCNLP 2009
Can You Summarize This? Identifying Correlates of Input Difficulty for Multi-Document Summarization ACL 2008
Revisiting Readability: A Unified Framework for Predicting Text Quality EMNLP 2008
Tutorial Abstracts of ACL-08: HLT ACL 2008
Entity-driven Rewrite for Multi-document Summarization IJCNLP 2008
Easily Identifiable Discourse Relations COLING 2008
High Frequency Word Entrainment in Spoken Dialogue ACL 2008
Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization ACL 2007
To Memorize or to Predict: Prominence labeling in Conversational Speech NAACL 2007
Automatically Learning Cognitive Status for Multi-Document Summarization of Newswire EMNLP 2005
Syntactic Simplification for Improving Content Selection in Multi-Document Summarization COLING 2004
Evaluating Content Selection in Summarization: The Pyramid Method NAACL 2004
Columbia’s Newsblaster: New Features and Future Directions NAACL 2003
References to Named Entities: a Corpus Study NAACL 2003