Daniel Khashabi

73 papers · 2015–2026 · 13 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (13) 🗺️ Taxonomy Completionist (12) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (10)

🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (12) 🌈 Renaissance Researcher (8) 🏠 Conference Loyalist (20) 🤝 Dynamic Duo (13) 👑 Triple Crown 🏆 Keyword Champion 🏆 Grand Slam 👥 Mega-Team (36) 🔬 Deep Specialist (13) 🧬 Topic Evolution 📈 Trend Setter ⚡ Prolific Year (9) ❓ The Questioner (8) 💎 Century Club (71) 🔥 Unstoppable (11) 🗃️ Keyword Collector (291)

Conferences

EMNLP (20) ACL (17) NAACL (12) ICLR (5) NIPS (4) EACL (3) IJCNLP (3) AAAI (2) ICML (2) IJCAI (2) AACL (1) COLING (1) CONLL (1)

Top co-authors

Benjamin Van Durme (14) Dan Roth (13) Jingyu Zhang (12) Tushar Khot (12) Ashish Sabharwal (12) Hannaneh Hajishirzi (8) Tianjian Li (7) Yejin Choi (7) Lingfeng Shen (7) Ben Zhou (5)

Keywords

large language model (16) question answering (9) language model (8) text generation (5) machine translation (4) zero-shot learning (4) natural language understanding (3) transfer learning (3) model evaluation (3) commonsense reasoning (3) claim verification (3) instruction tuning (2) low-resource language (2) locality-sensitive hashing (2) knowledge graph (2) prompt engineering (2) few-shot learning (2) constrained generation (2) data augmentation (2) natural language processing (2)

Papers

Query Decomposition for RAG: Balancing Exploration-Exploitation EACL 2026
arXiv2Table: Toward Realistic Benchmarking and Evaluation for LLM-Based Literature-Review Table Generation ACL 2026
Benchmarking Language Model Creativity: A Case Study on Code Generation NAACL 2025
Jailbreak Distillation: Renewable Safety Benchmarking EMNLP 2025
The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure IJCNLP 2025
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning ICML 2025
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data NAACL 2025
GenEx: Generating an Explorable World ICLR 2025
TurkingBench: A Challenge Benchmark for Web Agents NAACL 2025
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets NAACL 2025
Evaluating the Evaluators: Are readability metrics good measures of readability? EMNLP 2025
ICL CIPHERS: Quantifying “Learning” in In-Context Learning via Substitution Ciphers EMNLP 2025
SELF-[IN]CORRECT: LLMs Struggle with Discriminating Self-Generated Responses AAAI 2025
The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure AACL 2025
WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment AAAI 2025
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements ICLR 2025
Certified Mitigation of Worst-Case LLM Copyright Infringement EMNLP 2025
CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers? EMNLP 2025
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning ACL 2025
Challenging the Evaluator: LLM Sycophancy Under User Rebuttal EMNLP 2025
Core: Robust Factual Precision with Informative Sub-Claim Identification ACL 2025
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies EMNLP 2024
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation NIPS 2024
Efficient Large Multi-modal Models via Visual Context Compression NIPS 2024
RORA: Robust Free-Text Rationale Evaluation ACL 2024
k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text ACL 2024
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts ACL 2024
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation NAACL 2024
GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution EACL 2024
“According to ...”: Prompting Language Models Improves Quoting from Pre-Training Data EACL 2024
Position: Do pretrained Transformers Learn In-Context by Gradient Descent? ICML 2024
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models ICLR 2024
The Trickle-down Impact of Reward Inconsistency on RLHF ICLR 2024
Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell EMNLP 2024
Generating Sequences by Learning to Self-Correct ICLR 2023
Self-Instruct: Aligning Language Models with Self-Generated Instructions ACL 2023
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories ACL 2023
The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks ACL 2023
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency EMNLP 2023
Representation Projection Invariance Mitigates Representation Collapse EMNLP 2023
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics NAACL 2022
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics NIPS 2022
Cross-Task Generalization via Natural Language Crowdsourcing Instructions ACL 2022
Reframing Instructional Prompts to GPTk’s Language ACL 2022
Hey AI, Can You Solve Complex Tasks by Talking to Agents? ACL 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents EMNLP 2022
GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation EMNLP 2022
Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts NAACL 2022
Time Waits for No One! Analysis and Challenges of Temporal Misalignment NAACL 2022
Findings of the 2021 Conference on Machine Translation (WMT21) EMNLP 2021
GooAQ: Open Question Answering with Diverse Answer Types EMNLP 2021
Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions? ACL 2021
Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions? IJCNLP 2021
Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models NAACL 2021
UNQOVERing Stereotyping Biases via Underspecified Questions EMNLP 2020
UNIFIEDQA: Crossing Format Boundaries with a Single QA System EMNLP 2020
Evaluating Models’ Local Decision Boundaries via Contrast Sets EMNLP 2020
More Bang for Your Buck: Natural Perturbation for Robust Question Answering EMNLP 2020
TransOMCS: From Linguistic Graphs to Commonsense Knowledge IJCAI 2020
Temporal Common Sense Acquisition with Minimal Supervision ACL 2020
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses ACL 2020
“Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding EMNLP 2019
“Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding IJCNLP 2019
Seeing Things from a Different Angle: Discovering Diverse Perspectives about Claims NAACL 2019
PerspectroScope: A Window to the World of Diverse Perspectives ACL 2019
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop ACL 2019
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences NAACL 2018
Zero-Shot Open Entity Typing as Type-Compatible Grounding EMNLP 2018
Learning What is Essential in Questions CONLL 2017
Better call Saul: Flexible Programming for Learning and Inference in NLP COLING 2016
Question Answering via Integer Programming over Semi-Structured Knowledge IJCAI 2016
Solving Hard Coreference Problems NAACL 2015
Online Learning with Adversarial Delays NIPS 2015