Yinan He
16 papers · 2021–2025 · 5 top CS/AI conferences
Achievements
Taxonomy Completionist (29)
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (5)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (14)
Mega-Team (38)
Prolific Year (6)
Keyword Collector (69)
Unstoppable (5)
Century Club (16)
The Questioner
Conferences
CVPR (6)
ICCV (4)
ECCV (3)
ICLR (2)
NeurIPS (1)
Top co-authors
Keywords
video understanding (4)
zero-shot learning (2)
large language model (2)
diffusion model (2)
benchmark evaluation (2)
vision transformer (2)
masked autoencoder (2)
image restoration (1)
multi-task learning (1)
semantic segmentation (1)
video classification (1)
self-supervised learning (1)
weakly supervised learning (1)
in-context learning (1)
video recognition (1)
video generation (1)
multi-modal learning (1)
multimodal learning (1)
question answering (1)
temporal reasoning (1)
Papers
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
CVPR 2025
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images
CVPR 2025
DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations
ICCV 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
ICLR 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
ICCV 2025
VideoMamba: State Space Model for Efficient Video Understanding
ECCV 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
CVPR 2024
VBench: Comprehensive Benchmark Suite for Video Generative Models
CVPR 2024
Does Video-Text Pretraining Help Open-Vocabulary Online Action Detection?
NeurIPS 2024
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
ECCV 2024
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
ICLR 2024
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
ICCV 2023
VideoMAE V2: Scaling Video Masked Autoencoders With Dual Masking
CVPR 2023
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding
ICCV 2023
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
ECCV 2022
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
CVPR 2021