Xitong Yang
19 papers · 2015–2025 · 5 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Hot Topic Early Bird
Conference Polyglot (5)
Academic Marathon (10)
Renaissance Researcher (5)
Taxonomy Completionist (36)
Mega-Team (100)
Century Club (19)
Keyword Collector (79)
Prolific Year (5)
Unstoppable (7)
Conferences
CVPR (11)
ECCV (4)
ICCV (2)
ICML (1)
NIPS (1)
Research topics
Keywords
video understanding (6)
action recognition (5)
egocentric video (3)
weakly supervised learning (3)
video captioning (2)
video recognition (2)
weakly-supervised learning (2)
long video (2)
multiple instance learning (2)
multimodal learning (2)
video generation (1)
object detection (1)
vision transformer (1)
zero-shot learning (1)
curriculum learning (1)
temporal dynamics (1)
metric learning (1)
entity linking (1)
temporal reasoning (1)
pose estimation (1)
Papers
Progress-Aware Video Frame Captioning
CVPR 2025
GenRec: Unifying Video Generation and Recognition with Diffusion Models
NIPS 2024
Learning to Segment Referred Objects from Narrated Egocentric Videos
CVPR 2024
Video ReCap: Recursive Captioning of Hour-Long Videos
CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
CVPR 2024
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
ECCV 2024
Vision Transformers Are Good Mask Auto-Labelers
CVPR 2023
Relational Space-Time Query in Long-Form Videos
CVPR 2023
Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization
ICML 2023
Towards Scalable Neural Representation for Diverse Videos
CVPR 2023
ASM-Loc: Action-Aware Segment Modeling for Weakly-Supervised Temporal Action Localization
CVPR 2022
Semi-Supervised Vision Transformers
ECCV 2022
Efficient Video Transformers with Spatial-Temporal Token Selection
ECCV 2022
Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories
CVPR 2021
A Generic Visualization Approach for Convolutional Neural Networks
ECCV 2020
Cross-X Learning for Fine-Grained Visual Categorization
ICCV 2019
STEP: Spatio-Temporal Progressive Learning for Video Action Detection
CVPR 2019
Deep Multimodal Representation Learning From Temporal Data
CVPR 2017
Semantic Video Entity Linking Based on Visual Content and Metadata
ICCV 2015