Yuankai Qi
35 papers · 2016–2026 · 10 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Renaissance Researcher (10)
Academic Marathon (9)
Conference Polyglot (9)
Taxonomy Completionist (70)
Keyword Pioneer
Hot Topic Early Bird
Deep Specialist (14)
Dynamic Duo (13)
Topic Evolution
Keyword Champion (3)
Conference Pioneer
Century Club (32)
Unstoppable (8)
Keyword Collector (158)
Trend Setter
Prolific Year (8)
Conferences
CVPR (16)
AAAI (7)
ICCV (4)
ECCV (2)
ACL (1)
EACL (1)
IJCAI (1)
MICCAI (1)
NAACL (1)
NIPS (1)
Keywords
vision-language navigation (8)
speech synthesis (6)
multimodal learning (5)
movie dubbing (5)
convolutional neural network (3)
visual tracking (3)
object tracking (3)
contrastive learning (3)
vision-and-language navigation (3)
cross-modal alignment (3)
voice cloning (3)
multi-modal learning (3)
embodied ai (2)
ensemble learning (2)
zero-shot learning (2)
visual grounding (2)
diffusion model (2)
medical imaging (2)
referring expression (2)
video captioning (2)
Papers
InstructDubber: Instruction-based Alignment for Zero-shot Movie Dubbing
AAAI 2026
Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos
AAAI 2026
The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning
EACL 2026
Incomplete Multi-View Multi-Label Classification via Diffusion-Guided Redundancy Removal
AAAI 2025
Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval
AAAI 2025
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
ICCV 2025
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
CVPR 2025
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
CVPR 2025
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
CVPR 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
CVPR 2025
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering
CVPR 2025
Weakly Supervised Video Individual Counting
CVPR 2024
Augmented Commonsense Knowledge for Remote Object Grounding
AAAI 2024
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing
ACL 2024
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework
CVPR 2024
Generating Content for HDR Deghosting from Frequency View
CVPR 2024
Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis
MICCAI 2024
March in Chat: Interactive Prompting for Remote Embodied Referring Expression
ICCV 2023
Learning To Dub Movies via Hierarchical Prosody Models
CVPR 2023
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
CVPR 2023
AerialVLN: Vision-and-Language Navigation for UAVs
ICCV 2023
V2C: Visual Voice Cloning
CVPR 2022
HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation
CVPR 2022
Diagnosing Vision-and-Language Navigation: What Really Matters
NAACL 2022
Hierarchical Modular Network for Video Captioning
CVPR 2022
VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation
CVPR 2021
The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
ICCV 2021
Release the Power of Online-Training for Robust Visual Tracking
AAAI 2020
Object-and-Action Aware Model for Visual Language Navigation
ECCV 2020
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
CVPR 2020
Language and Visual Entity Relationship Graph for Agent Navigation
NIPS 2020
High Performance Gesture Recognition via Effective and Efficient Temporal Modeling
IJCAI 2019
Learning Attribute-Specific Representations for Visual Tracking
AAAI 2019
The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking
ECCV 2018
Hedged Deep Tracking
CVPR 2016