Biao Gong

21 papers · 2023–2025 · 6 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🐝 Cross-Pollinator (8)

🗺️ Taxonomy Completionist (34) 💎 Century Club (21) ⚡ Prolific Year (9) 🗃️ Keyword Collector (97)

Conferences

CVPR (10) ICCV (4) NIPS (3) ICLR (2) AAAI (1) ECCV (1)

Top co-authors

Yujun Shen (9) Kecheng Zheng (8) Yutong Feng (7) Shuai Tan (5) Jingdong Chen (5) Ming Yang (5) Yuyuan Li (5) Yiliang Lv (4) Shiwei Zhang (4) Siteng Huang (4)

Keywords

diffusion model (5) generative model (3) action recognition (2) text-to-image generation (2) contrastive learning (2) multimodal learning (2) vision-language model (2) image generation (2) foundation model (2) video generation (2) image captioning (1) self-supervised learning (1) temporal modeling (1) image synthesis (1) motion estimation (1) instruction following (1) benchmark evaluation (1) cross-modal retrieval (1) temporal grounding (1) transfer learning (1)

Papers

Animate-X: Universal Character Image Animation with Enhanced Motion Representation · ICLR 2025

ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance · AAAI 2025

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation · CVPR 2025

Learning Visual Generative Priors without Text · CVPR 2025

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning · CVPR 2025

Mimir: Improving Video Diffusion Models for Precise Text Understanding · CVPR 2025

DreamRelation: Relation-Centric Video Customization · ICCV 2025

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives · ICCV 2025

Framer: Interactive Frame Interpolation · ICLR 2025

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following · CVPR 2024

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models · ECCV 2024

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos · CVPR 2024

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation · CVPR 2024

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning · CVPR 2024

Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight · NIPS 2024

CURE4Rec: A Benchmark for Recommendation Unlearning with Deeper Influence · NIPS 2024

UKnow: A Unified Knowledge Protocol with Multimodal Knowledge Graph Datasets for Reasoning and Vision-Language Pre-Training · NIPS 2024

Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation · CVPR 2024

ViM: Vision Middleware for Unified Downstream Transferring · ICCV 2023

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos · ICCV 2023

VoP: Text-Video Co-Operative Prompt Tuning for Cross-Modal Retrieval · CVPR 2023