Dongzhan Zhou

18 papers · 2020–2026 · 9 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (8) 🌍 Conference Polyglot (9) 🏃 Academic Marathon (5) 🗺️ Taxonomy Completionist (29)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (14) 🧬 Topic Evolution ⚡ Prolific Year (8) 💎 Century Club (16) 🔥 Unstoppable (6) 🗃️ Keyword Collector (74)

Conferences

AAAI (4) CVPR (3) ECCV (2) EMNLP (2) ICLR (2) WACV (2) ICCV (1) ICML (1) NAACL (1)

Top co-authors

Wanli Ouyang (15) Xinchi Zhou (5) Peng Ye (5) Di Zhang (4) Di Hu (4) Yuqiang Li (4) Jingdi Lei (3) Shufei Zhang (3) Suorong Yang (3) Hang Zhou (3)

Research topics

Applications (1)

Keywords

sound separation (2) multimodal large language model (2) neural architecture search (2) audio-visual learning (2) optical character recognition (2) self-supervised learning (1) direct preference optimization (1) benchmark evaluation (1) document understanding (1) multimodal learning (1) autonomous driving (1) point cloud (1) mathematical reasoning (1) depth estimation (1) visual reasoning (1) preference optimization (1) conditional generation (1) feature representation (1) instruction tuning (1) chain-of-thought reasoning (1)

Papers

Mitigating Low-Quality Reasoning in MLLMs: Self-Driven Refined Multimodal CoT with Selective Thinking and Step-wise Visual Enhancement · AAAI 2026
Deep Research Arena: The First Exam of LLMs’ Research Abilities via Seminar-Grounded Tasks · AAAI 2026
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning · CVPR 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation · ICCV 2025
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area · AAAI 2025
Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models · EMNLP 2025
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses · ICLR 2025
A CLIP-Powered Framework for Robust and Generalizable Data Selection · ICLR 2025
When Dynamic Data Selection Meets Data Augmentation: Achieving Enhanced Training Acceleration · ICML 2025
LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search · NAACL 2025
LOCR: Location-Guided Transformer for Optical Character Recognition · EMNLP 2024
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes · ECCV 2024
Exploiting Visual Context Semantics for Sound Source Localization · WACV 2023
SeCo: Separating Unknown Musical Visual Sounds With Consistency Guidance · WACV 2023
SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation · AAAI 2022
Delving Into Localization Errors for Monocular 3D Object Detection · CVPR 2021
Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection · ECCV 2020
EcoNAS: Finding Proxies for Economical Neural Architecture Search · CVPR 2020