Mark Dredze

116 papers · 2007–2026 · 10 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (21) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (6) 🏠 Conference Loyalist (31) 🏆 Keyword Champion 🌱 Topic Pioneer 🤝 Dynamic Duo (15) 👥 Mega-Team (27) 🔬 Deep Specialist (15) 🧬 Topic Evolution ⚡ Prolific Year (10) 🔥 Unstoppable (19) 📈 Trend Setter 🗃️ Keyword Collector (319) 💎 Century Club (115) ❓ The Questioner (11) 🚀 Conference Pioneer

Conferences

ACL (33) EMNLP (33) NAACL (31) IJCNLP (5) COLING (4) CONLL (3) NIPS (3) EACL (2) AISTATS (1) JMLR (1)

Top co-authors

Benjamin Van Durme (15) Nicholas Andrews (12) Shijie Wu (9) Travis Wolfe (8) Nanyun Peng (7) Mo Yu (7) Koby Crammer (7) Fernando Pereira (7) Carlos Aguirre (7) Matthew R. Gormley (7)

Keywords

large language model (12) text classification (12) social media analysis (8) cross-lingual transfer (7) zero-shot learning (7) named entity recognition (7) transfer learning (5) social media (5) part-of-speech tagging (4) multi-task learning (4) online learning (4) language model (4) text generation (3) dependency parsing (3) depression detection (3) few-shot learning (3) entity linking (3) in-context learning (3) mental health detection (3) confidence-weighted learning (3)

Papers

Domain Generalizable AI Guardrails with Augmented Policy Training ACL 2026
A Novel Multi-Document Retrieval Benchmark: Journalist Source-Selection in Newswriting NAACL 2025
Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats ACL 2025
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models NAACL 2025
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions NAACL 2025
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models NAACL 2025
DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation EMNLP 2025
LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition EMNLP 2025
Evaluating the Evaluators: Are readability metrics good measures of readability? EMNLP 2025
A Closer Look at Claim Decomposition NAACL 2024
Multi-Task Transfer Matters During Instruction-Tuning ACL 2024
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models EMNLP 2024
Transferring Fairness using Multi-Task Learning with Limited Demographic Information EMNLP 2024
Can We Statically Locate Knowledge in Large Language Models? Financial Domain and Toxicity Reduction Case Studies EMNLP 2024
Schema-Driven Information Extraction from Heterogeneous Tables EMNLP 2024
Evaluating Biases in Context-Dependent Sexual and Reproductive Health Questions EMNLP 2024
Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts EMNLP 2024
Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs EMNLP 2024
Academics Can Contribute to Domain-Specialized Language Models EMNLP 2024
On the Surprising Effectiveness of Name Matching Alone in Autoregressive Entity Linking ACL 2023
Geo-Seq2seq: Twitter User Geolocation on Noisy Data through Sequence to Sequence Learning ACL 2023
MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies ACL 2023
Characterization of Stigmatizing Language in Medical Records ACL 2023
Joint End-to-end Semantic Proto-role Labeling ACL 2023
Strength in Numbers: Estimating Confidence of Large Language Models by Prompt Agreement ACL 2023
Explaining Models of Mental Health via Clinically Grounded Auxiliary Tasks NAACL 2022
Updated Headline Generation: Creating Updated Summaries for Evolving News Stories ACL 2022
Model Distillation for Faithful Explanations of Medical Code Predictions ACL 2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization ACL 2022
Zero-shot Cross-Language Transfer of Monolingual Entity Linking Models EMNLP 2022
What Makes Data-to-Text Generation Hard for Pretrained Language Models? EMNLP 2022
Do Text-to-Text Multi-Task Learners Suffer from Task Conflict? EMNLP 2022
Bernice: A Multilingual Pre-trained Encoder for Twitter EMNLP 2022
Changes in Tweet Geolocation over Time: A Study with Carmen 2.0 COLING 2022
Then and Now: Quantifying the Longitudinal Validity of Self-Disclosed Depression Diagnoses NAACL 2022
Towards Understanding the Role of Gender in Deploying Social Media-Based Mental Health Surveillance Models NAACL 2021
Qualitative Analysis of Depression Models by Demographics NAACL 2021
On the State of Social Media Data for Mental Health Research NAACL 2021
Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking ACL 2021
Using Noisy Self-Reports to Predict Twitter User Demographics NAACL 2021
Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction EMNLP 2021
Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling NAACL 2021
Study of Manifestation of Civil Unrest on Twitter EMNLP 2021
Gender and Racial Fairness in Depression Research using Social Media EACL 2021
User Factor Adaptation for User Embedding via Multitask Learning EACL 2021
Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking IJCNLP 2021
Clinical Concept Linking with Contextualized Neural Representations ACL 2020
Do Explicit Alignments Robustly Improve Multilingual Encoders? EMNLP 2020
Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest EMNLP 2020
Do Models of Mental Health Based on Social Media Data Generalize? EMNLP 2020
Are All Languages Created Equal in Multilingual BERT? ACL 2020
Sources of Transfer in Multilingual Named Entity Recognition ACL 2020
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT EMNLP 2019
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT IJCNLP 2019
Mental Health Surveillance over Social Media with Digital Cohorts NAACL 2019
Predicting Twitter User Demographics from Names Alone NAACL 2018
Challenges of Using Text Classifiers for Causal Inference EMNLP 2018
Using Author Embeddings to Improve Tweet Stance Classification EMNLP 2018
Convolutions Are All You Need (For Classifying Character Sequences) EMNLP 2018
Deep Dirichlet Multinomial Regression NAACL 2018
Johns Hopkins or johnny-hopkins: Classifying Individuals versus Organizations on Twitter NAACL 2018
Proceedings of ACL 2017, Student Research Workshop ACL 2017
CADET: Computer Assisted Discovery Extraction and Translation IJCNLP 2017
Bayesian Modeling of Lexical Resources for Low-Resource Settings ACL 2017
Pocket Knowledge Base Population ACL 2017
Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning ACL 2016
Geolocation for Twitter: Timing Matters NAACL 2016
Embedding Lexical Features via Low-Rank Tensors NAACL 2016
Learning Multiview Embeddings of Twitter Users ACL 2016
An Empirical Study of Chinese Name Matching and Applications ACL 2015
An Empirical Study of Chinese Name Matching and Applications IJCNLP 2015
FrameNet+: Fast Paraphrastic Tripling of FrameNet IJCNLP 2015
Predicate Argument Alignment using a Global Coherence Model NAACL 2015
Entity Linking for Spoken Language NAACL 2015
Combining Word Embeddings and Feature Embeddings for Fine-grained Relation Extraction NAACL 2015
A Concrete Chinese NLP Pipeline NAACL 2015
FrameNet+: Fast Paraphrastic Tripling of FrameNet ACL 2015
Improved Relation Extraction with Feature-Rich Compositional Embedding Models EMNLP 2015
Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings EMNLP 2015
Learning Polylingual Topic Models from Code-Switched Social Media Documents ACL 2014
Robust Entity Clustering via Phylogenetic Inference ACL 2014
Low-Resource Semantic Role Labeling ACL 2014
Improving Lexical Embeddings with Semantic Knowledge ACL 2014
PARMA: A Predicate Argument Aligner ACL 2013
Drug Extraction from the Web: Summarizing Drug Experiences with Multi-Dimensional Topic Models NAACL 2013
What’s in a Domain? Multi-Domain Learning for Multi-Attribute Data NAACL 2013
Separating Fact from Fear: Tracking Flu Infections on Twitter NAACL 2013
Broadly Improving User Classification via Communication-Based Name and Location Clustering on Twitter NAACL 2013
Topic Models and Metadata for Visualizing Text Corpora NAACL 2013
Confidence-Weighted Linear Classification for Text Categorization JMLR 2012
Multi-Domain Learning: When Do Domains Matter? EMNLP 2012
Name Phylogeny: A Generative Model of String Variation CONLL 2012
Shared Components Topic Models NAACL 2012
Fast Syntactic Analysis for Statistical Language Modeling via Substructure Sharing and Uptraining ACL 2012
Factorial LDA: Sparse Multi-Dimensional Text Models NIPS 2012
Entity Clustering Across Languages NAACL 2012
Revisiting the Case for Explicit Syntactic Information in Language Models NAACL 2012
Multi-Domain Learning: When Do Domains Matter? CONLL 2012
Name Phylogeny: A Generative Model of String Variation EMNLP 2012
Learning Sub-Word Units for Open Vocabulary Speech Recognition ACL 2011
Exploiting Feature Covariance in High-Dimensional Online Learning AISTATS 2010
We’re Not in Kansas Anymore: Detecting Domain Changes in Streams EMNLP 2010
NLP on Spoken Documents Without ASR EMNLP 2010
Streaming Cross Document Entity Coreference Resolution COLING 2010
Entity Disambiguation for Knowledge Base Population COLING 2010
Contextual Information Improves OOV Detection in Speech NAACL 2010
Multi-Class Confidence Weighted Algorithms EMNLP 2009
Adaptive Regularization of Weight Vectors NIPS 2009
Online Methods for Multi-Domain Learning and Adaptation EMNLP 2008
Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis COLING 2008
Icelandic Data Driven Part of Speech Tagging ACL 2008
Active Learning with Confidence ACL 2008
Exact Convex Confidence-Weighted Learning NIPS 2008
Frustratingly Hard Domain Adaptation for Dependency Parsing CONLL 2007
Frustratingly Hard Domain Adaptation for Dependency Parsing EMNLP 2007
Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification ACL 2007