Yihan Zeng

14 papers · 2022–2026 · 6 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (29) 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer

🐣 Hot Topic Early Bird 🤝 Dynamic Duo (10) 👥 Mega-Team (30) ⚡ Prolific Year (5) 💎 Century Club (13) 🔥 Unstoppable (5) 🗃️ Keyword Collector (66)

Conferences

CVPR (4) ICCV (3) AAAI (2) ECCV (2) ICLR (2) EMNLP (1)

Top co-authors

Hang Xu (10) Wei Zhang (5) Songcen Xu (5) Xiaodan Liang (5) Wangmeng Zuo (4) Dit-Yan Yeung (4) Jianhua Han (4) Tianyu Huang (3) Guansong Lu (3) ChenHan Jiang (3)

Keywords

vision-language model (2) multimodal learning (2) video diffusion (2) text-to-3d generation (2) image generation (2) diffusion model (2) cross-modal alignment (2) multimodal large language model (2) video diffusion prior (2) model robustness (1) autonomous driving (1) emotion recognition (1) point cloud (1) cross-modal learning (1) video generation (1) few-shot learning (1) semantic alignment (1) speech processing (1) transfer learning (1) bird's eye view (1)

Papers

CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback · AAAI 2026
Corrupted but Not Broken: Understanding and Mitigating the Negative Impacts of Corrupted Data in Visual Instruction Tuning · EMNLP 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions · CVPR 2025
DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors · AAAI 2025
FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors · ICCV 2025
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting · ICLR 2025
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion · ECCV 2024
TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields · ICLR 2024
DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior · CVPR 2024
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation · ECCV 2024
CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data · CVPR 2023
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability · ICCV 2023
Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images · ICCV 2023
LIFT: Learning 4D LiDAR Image Fusion Transformer for 3D Object Detection · CVPR 2022