Yuanhan Zhang

13 papers · 2020–2026 · 7 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (5) 🌍 Conference Polyglot (6) 🗺️ Taxonomy Completionist (22) 🐝 Cross-Pollinator (15) 👥 Mega-Team (22) 🤝 Dynamic Duo (10) 💎 Century Club (12) ⚡ Prolific Year (5) ❓ The Questioner (2)

Conferences

ECCV (5) CVPR (2) NAACL (2) ACL (1) ICCV (1) ICLR (1) NIPS (1)

Top co-authors

Ziwei Liu (11) Bo Li (7) Jingkang Yang (4) Shuai Liu (3) Yuhao Dong (3) Chunyuan Li (3) Ziyue Wang (2) Fanyi Pu (2) Kaiyang Zhou (2) Sicheng Zhang (2)

Keywords

multimodal learning (5) video understanding (3) video question answering (2) benchmark evaluation (2) large language model (2) video generation (1) egocentric vision (1) human perception (1) question answering (1) few-shot learning (1) in-context learning (1) generative model (1) visual reasoning (1) efficient computing (1) preference modeling (1) direct preference optimization (1) diffusion model (1) large multimodal model (1) reward model (1) adversarial robustness (1)

Papers

Video-MMMU: Evaluating Knowledge Acquisition from Multidisciplinary Professional Videos · ACL 2026
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward · NAACL 2025
Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding · ICCV 2025
LLaVA-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models · ICLR 2025
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models · NAACL 2025
EgoLife: Towards Egocentric Life Assistant · CVPR 2025
FunQA: Towards Surprising Video Comprehension · ECCV 2024
VBench: Comprehensive Benchmark Suite for Video Generative Models · CVPR 2024
Octopus: Embodied Vision-Language Programmer from Environmental Feedback · ECCV 2024
MMBENCH: Is Your Multi-Modal Model an All-around Player? · ECCV 2024
What Makes Good Examples for Visual In-Context Learning? · NIPS 2023
Benchmarking Omni-Vision Representation through the Lens of Visual Realms · ECCV 2022
CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations · ECCV 2020