Yale Song
37 papers · 2013–2026 · 11 top CS/AI conferences
Achievements
🌍 Conference Polyglot (11)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🏃 Academic Marathon (13)
🏆 Grand Slam
👥 Mega-Team (100)
🗃️ Keyword Collector (151)
🚀 Conference Pioneer
💎 Century Club (37)
🔥 Unstoppable (12)
📈 Trend Setter
⚡ Prolific Year (5)
Conferences
CVPR (12)
ICCV (6)
ICLR (4)
IJCAI (3)
NIPS (3)
WACV (3)
ICML (2)
AAAI (1)
CLEAR (1)
ECCV (1)
INTERSPEECH (1)
Keywords
video understanding (8)
multimodal learning (5)
multiple instance learning (3)
transfer learning (3)
mutual information (2)
recurrent neural network (2)
generative adversarial network (2)
action recognition (2)
unsupervised learning (2)
contrastive learning (2)
domain adaptation (2)
representation learning (2)
activity recognition (2)
video summarization (2)
video generation (2)
vision transformer (2)
self-supervised learning (2)
attention mechanism (2)
egocentric vision (2)
multimodal large language model (2)
Papers
Enhancing Visual Planning with Auxiliary Tasks and Multi-token Prediction
WACV 2026
VITED: Video Temporal Evidence Distillation
CVPR 2025
Streaming VideoLLMs for Real-Time Procedural Video Understanding
ICCV 2025
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
ICCV 2025
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
CVPR 2024
Egocentric Video Task Translation
CVPR 2023
Scaling Novel Object Detection With Weakly Supervised Detection Transformers
WACV 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
ICCV 2023
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities
NIPS 2023
Visual Attention Emerges from Recurrent Sparse Reconstruction
ICML 2022
CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning
CLEAR 2022
Neural-Sim: Learning to Generate Training Data with NeRF
ECCV 2022
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents
AAAI 2022
Robust Contrastive Learning Against Noisy Views
CVPR 2022
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
ICCV 2021
Active Contrastive Learning of Audio-Visual Video Representations
ICLR 2021
Self-Supervised Learning of Compressed Video Representations
ICLR 2021
Contrastive Learning of Global and Local Video Representations
NIPS 2021
Parameter Efficient Multimodal Transformers for Video Representation Learning
ICLR 2021
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency
INTERSPEECH 2020
Image to Video Domain Adaptation Using Web Supervision
WACV 2020
Characterizing Bias in Classifiers using Generative Models
NIPS 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
CVPR 2019
Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck
ICCV 2019
Neural TTS Stylization with Adversarial and Collaborative Games
ICLR 2019
Video Prediction with Appearance and Motion Conditions
ICML 2018
Improving Pairwise Ranking for Multi-Label Image Classification
CVPR 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
CVPR 2017
Learning From Noisy Labels With Distillation
ICCV 2017
Balancing Appearance and Context in Sketch Interpretation
IJCAI 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
CVPR 2016
Video2GIF: Automatic Generation of Animated GIFs From Video
CVPR 2016
Continuous Body and Hand Gesture Recognition for Natural Human-Computer Interaction: Extended Abstract
IJCAI 2015
TVSum: Summarizing Web Videos Using Titles
CVPR 2015
Video Co-Summarization: Video Summarization by Visual Co-Occurrence
CVPR 2015
One-Class Conditional Random Fields for Sequential Anomaly Detection
IJCAI 2013
Action Recognition by Hierarchical Sequence Summarization
CVPR 2013