Yizhuo Li

14 papers · 2020–2026 · 6 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (5) 🌍 Conference Polyglot (6) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (40) 🐣 Hot Topic Early Bird 🧬 Topic Evolution 🗃️ Keyword Collector (71) 💎 Century Club (13) 🔥 Unstoppable (6)

Conferences

CVPR (4) ICCV (3) AAAI (2) ACL (2) NIPS (2) ICLR (1)

Top co-authors

Cewu Lu (5) Yinan He (4) Yu Qiao (4) Kunchang Li (4) Yali Wang (4) Yi Wang (4) Bo Pang (4) Limin Wang (4) Yixiao Ge (2) Guo Chen (2)

Keywords

video understanding (4) object detection (3) large language model (2) action recognition (2) convolutional neural network (2) self-supervised learning (2) temporal modeling (1) video recognition (1) semantic segmentation (1) video generation (1) vision transformer (1) video classification (1) computer vision (1) attention mechanism (1) multimodal learning (1) in-context learning (1) test-time adaptation (1) multi-modal learning (1) pose estimation (1) benchmark evaluation (1)

Papers

OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models · ACL 2026
AnRe: Analogical Replay for Temporal Knowledge Graph Forecasting · ACL 2025
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos · ICCV 2025
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation · CVPR 2025
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark · CVPR 2024
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation · ICLR 2024
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding · ICCV 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models · ICCV 2023
Unsupervised Representation for Semantic Segmentation by Implicit Cycle-Attention Contrastive Learning · AAAI 2022
PGT: A Progressive Method for Training Models on Long Videos · CVPR 2021
TDAF: Top-Down Attention Framework for Vision Tasks · AAAI 2021
Test-Time Personalization with a Transformer for Human Pose Estimation · NIPS 2021
HOI Analysis: Integrating and Decomposing Human-Object Interaction · NIPS 2020
TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model · CVPR 2020