Zeyuan Chen

26 papers · 2021–2026 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐝 Cross-Pollinator (6) 🌍 Conference Polyglot (11) 🏃 Academic Marathon (5) 🌈 Renaissance Researcher (7)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (51) 🏆 Keyword Champion (2) 💎 Century Club (26) 🗃️ Keyword Collector (111) 🔥 Unstoppable (6) ⚡ Prolific Year (7)

Conferences

ICCV (6) CVPR (5) ECCV (4) AAAI (3) WACV (2) ACL (1) COLING (1) CORL (1) ICLR (1) NAACL (1) NIPS (1)

Top co-authors

Ran Xu (9) Zhuowen Tu (8) Xiang Zhang (6) Caiming Xiong (5) Haiyang Xu (4) Can Qin (4) Yihao Feng (3) Ning Yu (3) You Xie (2) Huan Wang (2)

Research topics

Techniques (1)

Keywords

diffusion model (5) 3d reconstruction (3) large language model (3) video generation (2) multimodal learning (2) attention mechanism (2) image generation (2) depth estimation (2) unsupervised learning (2) relevance modeling (2) preference optimization (2) shape reconstruction (2) scene understanding (1) bayesian inference (1) policy optimization (1) pose estimation (1) image restoration (1) autoregressive transformer (1) domain adaptation (1) frame interpolation (1)

Papers

Gaussian Swaying: Surface-Based Framework for Aerodynamic Simulation with 3D Gaussians · WACV 2026
CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning · WACV 2026
YOLO-Count: Differentiable Object Counting for Text-to-Image Generation · ICCV 2025
X-Dancer: Expressive Music to Human Dance Video Generation · ICCV 2025
ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes · CORL 2025
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search · NAACL 2025
Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting · COLING 2025
DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion · ICCV 2025
Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue · ICCV 2025
X-Dyna: Expressive Dynamic Human Image Animation · CVPR 2025
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant · ECCV 2024
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions · AAAI 2024
Bayesian Diffusion Models for 3D Shape Reconstruction · CVPR 2024
HIVE: Harnessing Human Feedback for Instructional Visual Editing · CVPR 2024
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer · ECCV 2024
Dolfin: Diffusion Layout Transformers without Autoencoder · ECCV 2024
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization · ICLR 2024
Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction · ICCV 2023
Tackling Data Heterogeneity in Federated Learning with Class Prototypes · AAAI 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation · ICCV 2023
VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution · CVPR 2022
Field Extraction from Forms with Unlabeled Data · ACL 2022
CASA: Category-agnostic Skeletal Animal Reconstruction · NIPS 2022
Burn after Reading: Online Adaptation for Cross-Domain Streaming Data · ECCV 2022
PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors · CVPR 2021
Graph-Based Tri-Attention Network for Answer Ranking in CQA · AAAI 2021