Ziqiao Ma

18 papers · 2022–2025 · 9 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (8) 🌈 Renaissance Researcher (6) 🧭 Keyword Pioneer 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge

🐣 Hot Topic Early Bird 👥 Mega-Team (24) 🤝 Dynamic Duo (16) 💎 Century Club (18) ⚡ Prolific Year (5) 🗃️ Keyword Collector (92) ❓ The Questioner (2)

Conferences

ACL (4) EMNLP (4) CVPR (2) NAACL (2) NIPS (2) CORL (1) ICCV (1) ICLR (1) IJCAI (1)

Top co-authors

Joyce Chai (16) Yichi Zhang (4) Sihan Xu (3) Jiayi Pan (3) Jianing Yang (3) Freda Shi (3) Yidong Huang (3) Jayjun Lee (2) Parisa Kordjamshidi (2) Yinpei Dai (2)

Research topics

Education (1)

Keywords

multimodal learning (4) vision-language model (4) benchmark evaluation (3) large language model (3) diffusion model (3) dialogue system (3) object hallucination (2) theory of mind (2) instruction following (2) multimodal large language model (2) video generation (1) autonomous driving (1) egocentric vision (1) multi-modal learning (1) code generation (1) natural language processing (1) computational linguistics (1) empirical study (1) cognitive modeling (1) embodied ai (1)

Papers

Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations · NAACL 2025
From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens · EMNLP 2025
VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation · ICCV 2025
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities · ICLR 2025
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies · CORL 2025
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors · ACL 2025
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation · ACL 2025
Learning Language through Grounding · NAACL 2025
Inversion-Free Image Editing with Language-Guided Diffusion Models · CVPR 2024
Multi-Object Hallucination in Vision Language Models · NIPS 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation · CVPR 2024
CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation · NIPS 2023
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models · ACL 2023
NLP Reproducibility For All: Understanding Experiences of Beginners · ACL 2023
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models · EMNLP 2023
Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue · IJCAI 2023
DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents · EMNLP 2022
DANLI: Deliberative Agent for Following Natural Language Instructions · EMNLP 2022