Danqi Chen

75 papers · 2013–2025 · 8 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8)

🐣 Hot Topic Early Bird 🏠 Conference Loyalist (26) 🌟 Keyword Trendsetter Combo (9) 🤝 Dynamic Duo (17) 👑 Triple Crown 🌱 Topic Pioneer 🏆 Keyword Champion (2) 👥 Mega-Team (22) 🔬 Deep Specialist (12) 🧬 Topic Evolution 📈 Trend Setter ⚡ Prolific Year (16) 🗃️ Keyword Collector (256) 💎 Century Club (75) 🔥 Unstoppable (13) ❓ The Questioner (2)

Conferences

EMNLP (26) ACL (17) ICLR (9) ICML (7) NIPS (7) NAACL (5) IJCNLP (3) EACL (1)

Top co-authors

Tianyu Gao (17) Mengzhou Xia (15) Alexander Wettig (13) Zexuan Zhong (11) Dan Friedman (9) Yangsibo Huang (7) Sanjeev Arora (7) Howard Yen (7) Sadhika Malladi (6) Howard Chen (6)

Keywords

open-domain question answering (8) information retrieval (7) in-context learning (7) dense retrieval (7) question answering (7) language model (7) large language model (6) few-shot learning (5) dense retriever (3) masked language model (3) named entity recognition (3) benchmark evaluation (3) pre-trained language model (3) machine reading comprehension (3) text classification (3) transfer learning (3) natural language inference (3) prompt engineering (3) language model fine-tuning (3) phrase retrieval (3)

Papers

Representing Rule-based Chatbots with Transformers NAACL 2025
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ICML 2025
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking EMNLP 2025
How to Train Long-Context Language Models (Effectively) ACL 2025
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly ICLR 2025
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ICLR 2025
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ICLR 2025
Fantastic Copyrighted Beasts and How (Not) to Generate Them ICLR 2025
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal ICLR 2025
Metadata Conditioning Accelerates Language Model Pre-training ICML 2025
SimPO: Simple Preference Optimization with a Reference-Free Reward NIPS 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning ICML 2024
QuRating: Selecting High-Quality Data for Training Language Models ICML 2024
Interpretability Illusions in the Generalization of Simplified Models ICML 2024
Long-Context Language Modeling with Parallel Context Encoding ACL 2024
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models ACL 2024
Language Models as Science Tutors ICML 2024
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ICLR 2024
Evaluating Large Language Models at Evaluating Instruction Following ICLR 2024
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation ICLR 2024
Detecting Pretraining Data from Large Language Models ICLR 2024
LitSearch: A Retrieval Benchmark for Scientific Literature Search EMNLP 2024
Finding Transformer Circuits With Edge Pruning NIPS 2024
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs NIPS 2024
Should You Mask 15% in Masked Language Modeling? EACL 2023
Adapting Language Models to Compress Contexts EMNLP 2023
A Kernel-Based View of Language Model Fine-Tuning ICML 2023
MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions EMNLP 2023
Privacy Implications of Retrieval-Based Language Models EMNLP 2023
Poisoning Retrieval Corpora by Injecting Adversarial Passages EMNLP 2023
Fine-Tuning Language Models with Just Forward Passes NIPS 2023
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations ACL 2023
Training Trajectories of Language Models Across Scales ACL 2023
Retrieval-based Language Models and Applications ACL 2023
Optimizing Test-Time Query Representations for Dense Retrieval ACL 2023
What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ACL 2023
MoQA: Benchmarking Multi-Type Open-Domain Question Answering ACL 2023
Enabling Large Language Models to Generate Text with Citations EMNLP 2023
C-STS: Conditional Semantic Textual Similarity EMNLP 2023
Learning Transformer Programs NIPS 2023
Generating Natural Language Proofs with Verifier-Guided Search EMNLP 2022
Recovering Private Text in Federated Learning of Language Models NIPS 2022
Structured Pruning Learns Compact and Accurate Models ACL 2022
Ditch the Gold Standard: Re-evaluating Conversational Question Answering ACL 2022
Finding Dataset Shortcuts with Grammar Induction EMNLP 2022
Training Language Models with Memory Augmentation EMNLP 2022
Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models EMNLP 2022
MABEL: Attenuating Gender Bias using Textual Entailment Data EMNLP 2022
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models EMNLP 2022
Can Rationalization Improve Robustness? NAACL 2022
SimCSE: Simple Contrastive Learning of Sentence Embeddings EMNLP 2021
Simple Entity-Centric Questions Challenge Dense Retrievers EMNLP 2021
Single-dataset Experts for Multi-dataset Question Answering EMNLP 2021
Phrase Retrieval Learns Passage Retrieval, Too EMNLP 2021
Learning Dense Representations of Phrases at Scale ACL 2021
Making Pre-trained Language Models Better Few-shot Learners ACL 2021
Making Pre-trained Language Models Better Few-shot Learners IJCNLP 2021
Learning Dense Representations of Phrases at Scale IJCNLP 2021
A Frustratingly Easy Approach for Entity and Relation Extraction NAACL 2021
Non-Parametric Few-Shot Learning for Word Sense Disambiguation NAACL 2021
Factual Probing Is [MASK]: Learning vs. Learning to Recall NAACL 2021
Open-Domain Question Answering ACL 2020
Dense Passage Retrieval for Open-Domain Question Answering EMNLP 2020
TextHide: Tackling Data Privacy in Language Understanding Tasks EMNLP 2020
A Discrete Hard EM Approach for Weakly Supervised Question Answering EMNLP 2019
A Discrete Hard EM Approach for Weakly Supervised Question Answering IJCNLP 2019
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension EMNLP 2019
Proceedings of the 2nd Workshop on Machine Reading for Question Answering EMNLP 2019
Proceedings of the Workshop on Machine Reading for Question Answering ACL 2018
Reading Wikipedia to Answer Open-Domain Questions ACL 2017
Position-aware Attention and Supervised Data Improve Slot Filling EMNLP 2017
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task ACL 2016
Representing Text for Joint Embedding of Text and Knowledge Bases EMNLP 2015
A Fast and Accurate Dependency Parser using Neural Networks EMNLP 2014
Reasoning With Neural Tensor Networks for Knowledge Base Completion NIPS 2013