Jesse Thomason

44 papers · 2013–2026 · 13 top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (13)

🗺️ Taxonomy Completionist (71) 🔬 Deep Specialist (18) 🌱 Topic Pioneer 🗃️ Keyword Collector (170) ⚡ Prolific Year (7) 🚀 Conference Pioneer 💎 Century Club (42) 🔥 Unstoppable (13) 📈 Trend Setter ❓ The Questioner (4)

Conferences

EMNLP (10) NAACL (9) CORL (7) IJCAI (4) ACL (3) AAAI (2) CVPR (2) EACL (2) COLING (1) IJCNLP (1) INTERSPEECH (1) NIPS (1) RSS (1)

Top co-authors

Robin Jia (7) Yonatan Bisk (6) Tejas Srinivasan (5) Wang Zhu (4) Peter Stone (4) Raymond J. Mooney (4) Abrar Anwar (4) Ting-Yun Chang (4) Aishwarya Padmakumar (4) Luke Zettlemoyer (3)

Research topics

Robotics (1) Linguistics (1)

Keywords

multimodal learning (8) vision-language model (6) embodied ai (4) large language model (4) instruction following (4) natural language understanding (3) vision-language navigation (3) visual reasoning (3) embodied agent (3) visual question answering (3) language grounding (3) reinforcement learning (2) multi-agent reinforcement learning (2) robot navigation (2) human-robot interaction (2) benchmark evaluation (2) visual navigation (2) semantic parsing (2) natural language processing (2) catastrophic forgetting (2)

Papers

Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations ACL 2026
Learning to Deliberate: Meta-policy Collaboration for Agentic LLMs with Multi-agent Reinforcement Learning AAAI 2026
The American Sign Language Knowledge Graph: Infusing ASL Models with Linguistic Knowledge NAACL 2025
Language Models Can Infer Action Semantics for Symbolic Planners from Environment Feedback NAACL 2025
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations CORL 2025
Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection CORL 2025
Why Do Some Inputs Break Low-Bit LLM Quantization? EMNLP 2025
Can VLMs Recall Factual Associations From Visual References? EMNLP 2025
Large Language Models Do Multi-Label Classification Differently EMNLP 2025
Selective “Selective Prediction”: Reducing Unnecessary Abstention in Vision-Language Reasoning ACL 2024
Contrast Sets for Evaluating Language-Guided Robot Policies CORL 2024
When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models EMNLP 2024
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding NAACL 2024
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks NAACL 2024
Efficient End-to-End Visual Document Understanding with Rationale Distillation NAACL 2024
WINOVIZ: Probing Visual Properties of Objects Under Different States NAACL 2024
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation RSS 2024
Multimodal Speech Recognition for Language-Guided Embodied Agents INTERSPEECH 2023
Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering EMNLP 2023
Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation EMNLP 2023
Improving Sign Recognition with Phonology EACL 2023
Iterative Vision-and-Language Navigation CVPR 2023
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks NIPS 2022
ALFRED-L: Investigating the Role of Language for Action Learning in Interactive Visual Environments EMNLP 2022
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems EMNLP 2022
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions ACL 2022
TEACh: Task-Driven Embodied Agents That Chat AAAI 2022
Language Grounding with 3D Objects CORL 2021
RMM: A Recursive Mental Model for Dialogue Navigation EMNLP 2020
The RobotSlang Benchmark: Dialog-guided Robot Localization and Navigation CORL 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks CVPR 2020
Experience Grounds Language EMNLP 2020
Shifting the Baseline: Single Modality Performance on Visual Navigation & QA NAACL 2019
Proceedings of the Combined Workshop on Spatial Language Understanding (SpLU) and Grounded Communication for Robotics (RoboNLP) NAACL 2019
Vision-and-Dialog Navigation CORL 2019
Multi-modal Predicate Identification using Dynamically Learned Robot Controllers IJCAI 2018
Opportunistic Active Learning for Grounding Natural Language Descriptions CORL 2017
Integrated Learning of Dialog Strategies and Semantic Parsing EACL 2017
Improving Black-box Speech Recognition using Semantic Parsing IJCNLP 2017
Multi-Modal Word Synset Induction IJCAI 2017
Learning Multi-Modal Grounded Linguistic Semantics by Playing “I Spy” IJCAI 2016
Learning to Interpret Natural Language Commands through Human-Robot Dialog IJCAI 2015
Integrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild COLING 2014
Differences in User Responses to a Wizard-of-Oz versus Automated System NAACL 2013