Jonathan May

98 papers · 2006–2026 · 12 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (16) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (12)

🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (28) 🤝 Dynamic Duo (15) 👥 Mega-Team (20) 🔬 Deep Specialist (16) 🧬 Topic Evolution 🏆 Keyword Champion ❓ The Questioner (9) 💎 Century Club (95) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (11) 🗃️ Keyword Collector (368) ⚡ Prolific Year (6)

Conferences

ACL (29) EMNLP (28) NAACL (20) IJCNLP (7) SEMEVAL (4) CONLL (2) EACL (2) NIPS (2) COLING (1) ICLR (1) IJCAI (1) INTERSPEECH (1)

Top co-authors

Kevin Knight (15) Nanyun Peng (11) Heng Ji (11) Alexander Spangher (9) Thamme Gowda (7) Xuezhe Ma (7) Hyundong Cho (7) Xiang Ren (5) Hyundong Justin Cho (5) Muhao Chen (5)

Research topics

Digital Humanities (2) Linguistics (1)

Keywords

cross-lingual transfer (10) transfer learning (10) language model (9) large language model (8) low-resource language (8) event extraction (6) question answering (6) neural machine translation (5) named entity recognition (5) dialogue system (5) machine translation (5) few-shot learning (4) text generation (4) text classification (4) reinforcement learning (4) multilingual model (4) entity linking (3) information extraction (3) commonsense reasoning (3) word embedding (3)

Papers

A Representation Sharpening Framework for Zero Shot Dense Retrieval EACL 2026
Uncovering Intervention Opportunities for Suicide Prevention with Language Model Assistants ACL 2026
GTA: Generating Long-horizon Tasks for Web Agents at Scale ACL 2026
Learning to Rewrite Negation Queries in Product Search COLING 2025
Personalized Help for Optimizing Low-Skilled Users’ Strategy NAACL 2025
Style Transfer with Multi-iteration Preference Optimization NAACL 2025
A Little Human Data Goes A Long Way ACL 2025
The Million Authors Corpus: A Cross-Lingual and Cross-Domain Wikipedia Dataset for Authorship Verification ACL 2025
Can Vision Language Models Understand Mimed Actions? ACL 2025
Can VLMs Recall Factual Associations From Visual References? EMNLP 2025
Teaching Language Models To Gather Information Proactively EMNLP 2025
Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning NAACL 2025
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory ACL 2025
NewsInterview: a Dataset and a Playground to Evaluate LLMs’ Grounding Gap via Informational Interviews ACL 2025
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL ACL 2025
LegalDiscourse: Interpreting When Laws Apply and To Whom NAACL 2024
Speechworthy Instruction-tuned Language Models EMNLP 2024
Are Large Language Models Capable of Generating Human-Level Narratives? EMNLP 2024
BotEval: Facilitating Interactive Human Evaluation ACL 2024
More Victories, Less Cooperation: Assessing Cicero’s Diplomacy Play ACL 2024
Leitner-Guided Memory Replay for Cross-lingual Continual Learning NAACL 2024
Can Language Model Moderators Improve the Health of Online Discourse? NAACL 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length NIPS 2024
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction ACL 2024
Tracking the Newsworthiness of Public Documents ACL 2024
Explaining Mixtures of Sources in News Articles EMNLP 2024
Identifying Informational Sources in News Articles EMNLP 2023
Mega: Moving Average Equipped Gated Attention ICLR 2023
Cross-lingual Continual Learning ACL 2023
RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation ACL 2023
WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models ACL 2023
Know Where You’re Going: Meta-Learning for Parameter-Efficient Fine-Tuning ACL 2023
Bridging the Gap between Native Text and Translated Text through Adversarial Learning: A Case Study on Cross-Lingual Event Extraction EACL 2023
Challenges in Context-Aware Neural Machine Translation EMNLP 2023
Continual Dialogue State Tracking via Example-Guided Question Answering EMNLP 2023
Analyzing Norm Violations in Live-Stream Chat EMNLP 2023
Machine Translation Robustness to Natural Asemantic Variation EMNLP 2022
Segmenting Numerical Substitution Ciphers EMNLP 2022
Know Thy Strengths: Comprehensive Dialogue State Tracking Diagnostics EMNLP 2022
Investigating the Benefits of Free-Form Rationales EMNLP 2022
NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge NAACL 2022
Augmenting Training Data for Massive Semantic Matching Models in Low-Traffic E-commerce Stores NAACL 2022
Opponent Modeling in Negotiation Dialogues by Related Data Adaptation NAACL 2022
Building an Event Extractor with Only a Few Examples NAACL 2022
Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation EMNLP 2021
WARP: Word-level Adversarial ReProgramming IJCNLP 2021
Can Sequence-to-Sequence Models Crack Substitution Ciphers? IJCNLP 2021
Many-to-English Machine Translation Tools, Data, and Pretrained Models IJCNLP 2021
Summary-Oriented Question Generation for Informational Queries ACL 2021
Many-to-English Machine Translation Tools, Data, and Pretrained Models ACL 2021
Can Sequence-to-Sequence Models Crack Substitution Ciphers? ACL 2021
WARP: Word-level Adversarial ReProgramming ACL 2021
Summary-Oriented Question Generation for Informational Queries IJCNLP 2021
Salience-Aware Event Chain Modeling for Narrative Understanding EMNLP 2021
Luna: Linear Unified Nested Attention NIPS 2021
X-METRA-ADA: Cross-lingual Meta-Transfer learning Adaptation to Natural Language Understanding and Question Answering NAACL 2021
CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems NAACL 2021
Multitask Semi-Supervised Learning for Class-Imbalanced Discourse Classification EMNLP 2021
Macro-Average: Rare Types Are Important Too NAACL 2021
Learning to Generalize for Sequential Decision Making EMNLP 2020
Finding the Optimal Vocabulary Size for Neural Machine Translation EMNLP 2020
Experience Grounds Language EMNLP 2020
Connecting the Dots: Event Graph Schema Induction with Path Language Modeling EMNLP 2020
Grounding Conversations with Improvised Dialogues ACL 2020
Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training ACL 2020
What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis IJCNLP 2019
Contextualized Cross-Lingual Event Trigger Extraction with Minimal Resources CONLL 2019
Cross-lingual Structure Transfer for Relation and Event Extraction IJCNLP 2019
Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects IJCNLP 2019
A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages NAACL 2019
Cross-lingual Multi-Level Adversarial Transfer to Enhance Low-Resource Name Tagging NAACL 2019
Proceedings of the 13th International Workshop on Semantic Evaluation SEMEVAL 2019
SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage ACL 2019
Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation ACL 2019
Cross-lingual Structure Transfer for Relation and Event Extraction EMNLP 2019
Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects EMNLP 2019
What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis EMNLP 2019
Cross-lingual Joint Entity and Word Embedding to Improve Entity Linking and Parallel Sentence Mining EMNLP 2019
Recurrent Neural Networks as Weighted Language Recognizers NAACL 2018
Proceedings of the 12th International Workshop on Semantic Evaluation SEMEVAL 2018
Translating a Language You Don’t Know In the Chinese Room ACL 2018
Towards Controllable Story Generation NAACL 2018
Out-of-the-box Universal Romanization Tool uroman ACL 2018
ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System NAACL 2018
SemEval-2017 Task 9: Abstract Meaning Representation Parsing and Generation SEMEVAL 2017
Cross-lingual Name Tagging and Linking for 282 Languages ACL 2017
Team ELISA System for DARPA LORELEI Speech Evaluation 2016 INTERSPEECH 2017
Transfer Learning for Low-Resource Neural Machine Translation EMNLP 2016
Simple, Fast Noise-Contrastive Estimation for Large RNN Vocabularies NAACL 2016
SemEval-2016 Task 8: Meaning Representation Parsing SEMEVAL 2016
Parsing English into Abstract Meaning Representation Using Syntax-Based Machine Translation EMNLP 2015
Identifying Useful Human Correction Feedback from an On-line Machine Translation Service IJCAI 2013
Models of Translation Competitions ACL 2013
Tuning as Ranking EMNLP 2011
Efficient Inference through Cascades of Weighted Tree Transducers ACL 2010
Syntactic Re-Alignment Models for Machine Translation EMNLP 2007
Syntactic Re-Alignment Models for Machine Translation CONLL 2007
A Better N-Best List: Practical Determinization of Weighted Finite Tree Automata NAACL 2006