Joyce Chai

56 papers · 2002–2026 · 12 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (8) 🌍 Conference Polyglot (12) 🧭 Keyword Pioneer 🏃 Academic Marathon (23) 🌈 Renaissance Researcher (9)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (79) 🏠 Conference Loyalist (20) 🤝 Dynamic Duo (16) 🏆 Keyword Champion (2) 🔬 Deep Specialist (14) 🗃️ Keyword Collector (209) 🔥 Unstoppable (6) 📈 Trend Setter ❓ The Questioner (4) ⚡ Prolific Year (11) 💎 Century Club (55)

Conferences

EMNLP (20) ACL (15) NAACL (6) CVPR (4) IJCNLP (3) NIPS (2) COLING (1) CORL (1) ICCV (1) ICLR (1) IJCAI (1) WACV (1)

Top co-authors

Ziqiao Ma (16) Shane Storks (10) Yichi Zhang (10) Parisa Kordjamshidi (6) Jianing Yang (6) Qiaozi Gao (5) Shaohua Yang (4) Guangyue Xu (4) Yuwei Bao (4) Zheyuan Zhang (4)

Research topics

Linguistics (1) Education (1)

Keywords

multimodal learning (8) large language model (5) vision-language model (4) dialogue system (4) benchmark evaluation (3) embodied ai (3) diffusion model (3) compositional concept (3) visual grounding (3) language understanding (3) theory of mind (3) visual reasoning (3) zero-shot learning (3) multimodal large language model (2) natural language processing (2) vision language model (2) embodied agent (2) interactive learning (2) few-shot learning (2) reinforcement learning (2)

Papers

Sparse Feature Coactivation Reveals Causal Semantic Modules in Large Language Models · ACL 2026
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies · CORL 2025
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass · CVPR 2025
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors · ACL 2025
Learning Language through Grounding · NAACL 2025
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations · NAACL 2025
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities · ICLR 2025
VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation · ICCV 2025
Benchmarking and Improving LLM Robustness for Personalized Generation · EMNLP 2025
Transparent and Coherent Procedural Mistake Detection · EMNLP 2025
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination · CVPR 2025
Proactive Assistant Dialogue Generation from Streaming Egocentric Videos · EMNLP 2025
Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties · EMNLP 2024
Inversion-Free Image Editing with Language-Guided Diffusion Models · CVPR 2024
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use · EMNLP 2024
Multi-Object Hallucination in Vision Language Models · NIPS 2024
GIPCOL: Graph-Injected Soft Prompting for Compositional Zero-Shot Learning · WACV 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation · CVPR 2024
MetaReVision: Meta-Learning with Retrieval for Visually Grounded Compositional Concept Acquisition · EMNLP 2023
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models · ACL 2023
In-Context Analogical Reasoning with Pre-Trained Language Models · ACL 2023
NLP Reproducibility For All: Understanding Experiences of Beginners · ACL 2023
Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition · ACL 2023
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? · EMNLP 2023
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning · EMNLP 2023
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models · EMNLP 2023
CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation · NIPS 2023
Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake? · EMNLP 2023
Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue · IJCAI 2023
Learning to Mediate Disparities Towards Pragmatic Communication · ACL 2022
DANLI: Deliberative Agent for Following Natural Language Instructions · EMNLP 2022
DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents · EMNLP 2022
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding · EMNLP 2021
Zero-Shot Compositional Concept Learning · ACL 2021
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring · ACL 2021
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring · IJCNLP 2021
Zero-Shot Compositional Concept Learning · IJCNLP 2021
MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks · EMNLP 2021
Beyond the Tip of the Iceberg: Assessing Coherence of Text Classifiers · EMNLP 2021
Experience Grounds Language · EMNLP 2020
Commonsense Justification for Action Explanation · EMNLP 2018
What Action Causes This? Towards Naive Physical Action-Effect Prediction · ACL 2018
Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication · ACL 2017
Incremental Acquisition of Verb Hypothesis Space towards Physical World Interaction · ACL 2016
Physical Causality of Action Verbs in Grounded Language Understanding · ACL 2016
Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration · EMNLP 2016
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies · NAACL 2015
Autonomous Self-Assessment of Autocorrections: Exploring Text Message Dialogues · NAACL 2012
Beyond Normalization: Pragmatics of Word Form in Text Messages · IJCNLP 2011
Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates · ACL 2010
Towards Conversation Entailment: An Empirical Investigation · EMNLP 2010
The Role of Implicit Argumentation in Nominal SRL · NAACL 2009
Incorporating Temporal and Semantic Information with Eye Gaze for Automatic Word Acquisition in Multimodal Conversational Systems · EMNLP 2008
An Exploration of Eye Gaze in Spoken Language Processing for Multimodal Conversational Interfaces · NAACL 2007
Automated Vocabulary Acquisition and Interpretation in Multimodal Conversational Systems · ACL 2007
Semantics-based Representation for Multimodal Interpretation in Conversational Systems · COLING 2002