Yuankai Qi

35 papers · 2016–2026 · 10 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (10) 🏃 Academic Marathon (9) 🌍 Conference Polyglot (9) 🗺️ Taxonomy Completionist (70)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🔬 Deep Specialist (14) 🤝 Dynamic Duo (13) 🧬 Topic Evolution 🏆 Keyword Champion (3) 🚀 Conference Pioneer 💎 Century Club (32) 🔥 Unstoppable (8) 🗃️ Keyword Collector (158) 📈 Trend Setter ⚡ Prolific Year (8)

Conferences

CVPR (16) AAAI (7) ICCV (4) ECCV (2) ACL (1) EACL (1) IJCAI (1) MICCAI (1) NAACL (1) NIPS (1)

Top co-authors

Qi Wu (13) Qingming Huang (12) Anton van den Hengel (10) Ming-Hsuan Yang (10) Guorong Li (6) Liang Li (6) Yicong Hong (5) Qi Chen (4) Gaoxiang Cong (4) Minh-Son To (3)

Keywords

vision-language navigation (8) speech synthesis (6) multimodal learning (5) movie dubbing (5) convolutional neural network (3) visual tracking (3) object tracking (3) contrastive learning (3) vision-and-language navigation (3) cross-modal alignment (3) voice cloning (3) multi-modal learning (3) embodied ai (2) ensemble learning (2) zero-shot learning (2) visual grounding (2) diffusion model (2) medical imaging (2) referring expression (2) video captioning (2)

Papers

InstructDubber: Instruction-based Alignment for Zero-shot Movie Dubbing · AAAI 2026
Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos · AAAI 2026
The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning · EACL 2026
Incomplete Multi-View Multi-Label Classification via Diffusion-Guided Redundancy Removal · AAAI 2025
Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval · AAAI 2025
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding · ICCV 2025
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing · CVPR 2025
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning · CVPR 2025
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering · CVPR 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing · CVPR 2025
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering · CVPR 2025
Weakly Supervised Video Individual Counting · CVPR 2024
Augmented Commonsense Knowledge for Remote Object Grounding · AAAI 2024
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing · ACL 2024
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework · CVPR 2024
Generating Content for HDR Deghosting from Frequency View · CVPR 2024
Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis · MICCAI 2024
March in Chat: Interactive Prompting for Remote Embodied Referring Expression · ICCV 2023
Learning To Dub Movies via Hierarchical Prosody Models · CVPR 2023
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection · CVPR 2023
AerialVLN: Vision-and-Language Navigation for UAVs · ICCV 2023
V2C: Visual Voice Cloning · CVPR 2022
HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation · CVPR 2022
Diagnosing Vision-and-Language Navigation: What Really Matters · NAACL 2022
Hierarchical Modular Network for Video Captioning · CVPR 2022
VLN BERT: A Recurrent Vision-and-Language BERT for Navigation · CVPR 2021
The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation · ICCV 2021
Release the Power of Online-Training for Robust Visual Tracking · AAAI 2020
Object-and-Action Aware Model for Visual Language Navigation · ECCV 2020
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments · CVPR 2020
Language and Visual Entity Relationship Graph for Agent Navigation · NIPS 2020
High Performance Gesture Recognition via Effective and Efficient Temporal Modeling · IJCAI 2019
Learning Attribute-Specific Representations for Visual Tracking · AAAI 2019
The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking · ECCV 2018
Hedged Deep Tracking · CVPR 2016