Yujie Zhong

31 papers · 2020–2026 · 9 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (9) 🏃 Academic Marathon (5) 🗺️ Taxonomy Completionist (51)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (14) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🗃️ Keyword Collector (143) ⚡ Prolific Year (8) 💎 Century Club (30) 🔥 Unstoppable (6)

Conferences

CVPR (10) ICCV (9) ECCV (4) AAAI (3) AACL (1) ACL (1) EMNLP (1) ICLR (1) IJCNLP (1)

Top co-authors

Lin Ma (14) Chengjian Feng (9) Weilin Huang (6) Sheng Guo (4) Yingsen Zeng (4) Dengjie Li (4) Zequn Jie (4) Jie Hu (3) Weidi Xie (3) Qiong Cao (3)

Keywords

contrastive learning (5) object detection (5) instance segmentation (3) video understanding (3) representation learning (3) semantic segmentation (3) vision-language model (3) autonomous driving (3) self-supervised learning (3) large language model (2) action recognition (2) cross-modal correspondence (2) visual large language model (2) open relation extraction (2) temporal representation (2) multi-task learning (2) visual language model (2) metric learning (2) frame sampling (2) transfer learning (1)

Papers

ViType: High-Fidelity Visual Text Rendering via Glyph-Aware Multimodal Diffusion · AAAI 2026
Mr. DETR: Instructive Multi-Route Training for Detection Transformers · CVPR 2025
CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets · ICLR 2025
Advancing Visual Large Language Model for Multi-granular Versatile Perception · ICCV 2025
DisTime: Distribution-based Time Representation for Video Large Language Models · ICCV 2025
RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case · ICCV 2025
Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs · ICCV 2025
HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver · CVPR 2025
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving · ICCV 2025
v-CLR: View-Consistent Learning for Open-World Instance Segmentation · CVPR 2025
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models · ICCV 2025
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset · CVPR 2024
Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models · CVPR 2024
When Phrases Meet Probabilities: Enabling Open Relation Extraction with Cooperating Large Language Models · ACL 2024
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection · ECCV 2024
Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network · ICCV 2023
Adaptive Sparse Pairwise Loss for Object Re-Identification · CVPR 2023
TriDet: Temporal Action Detection With Relative Boundary Modeling · CVPR 2023
AeDet: Azimuth-Invariant Multi-View 3D Object Detection · CVPR 2023
PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images · ECCV 2022
InsCLR: Improving Instance Retrieval with Self-Supervision · AAAI 2022
Contrastive Video-Language Learning with Fine-grained Frame Sampling · AACL 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers · CVPR 2022
Cross-Architecture Self-Supervised Video Representation Learning · CVPR 2022
ReAct: Temporal Action Detection with Relational Queries · ECCV 2022
MatchPrompt: Prompt-based Open Relation Extraction with Semantic Consistency Guided Clustering · EMNLP 2022
Contrastive Video-Language Learning with Fine-grained Frame Sampling · IJCNLP 2022
Exploring Classification Equilibrium in Long-Tailed Object Detection · ICCV 2021
Unchain the Search Space with Hierarchical Differentiable Architecture Search · AAAI 2021
TOOD: Task-Aligned One-Stage Object Detection · ICCV 2021
Representation Sharing for Fast Object Detector Search and Beyond · ECCV 2020