Yale Song

37 papers · 2013–2026 · 11 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (11) 🏃 Academic Marathon (13) 🏆 Grand Slam 👥 Mega-Team (100) 🗃️ Keyword Collector (151) 🚀 Conference Pioneer 💎 Century Club (37) 🔥 Unstoppable (12) 📈 Trend Setter ⚡ Prolific Year (5) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird

Conferences

CVPR (12) ICCV (6) ICLR (4) IJCAI (3) NIPS (3) WACV (3) ICML (2) AAAI (1) CLEAR (1) ECCV (1) INTERSPEECH (1)

Top co-authors

Daniel McDuff (8) Shuang Ma (7) Lorenzo Torresani (5) Gunhee Kim (5) Randall Davis (4) Xin Wang (4) Youngjae Yu (4) Vibhav Vineet (4) Neel Joshi (4) Alejandro Jaimes (3)

Keywords

video understanding (8) multimodal learning (5) multiple instance learning (3) transfer learning (3) mutual information (2) recurrent neural network (2) generative adversarial network (2) action recognition (2) unsupervised learning (2) contrastive learning (2) domain adaptation (2) representation learning (2) activity recognition (2) video summarization (2) video generation (2) vision transformer (2) self-supervised learning (2) attention mechanism (2) egocentric vision (2) multimodal large language model (2)

Papers

Enhancing Visual Planning with Auxiliary Tasks and Multi-token Prediction · WACV 2026
VITED: Video Temporal Evidence Distillation · CVPR 2025
Streaming VideoLLMs for Real-Time Procedural Video Understanding · ICCV 2025
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs · ICCV 2025
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives · CVPR 2024
Egocentric Video Task Translation · CVPR 2023
Scaling Novel Object Detection With Weakly Supervised Detection Transformers · WACV 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone · ICCV 2023
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities · NIPS 2023
Visual Attention Emerges from Recurrent Sparse Reconstruction · ICML 2022
CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning · CLEAR 2022
Neural-Sim: Learning to Generate Training Data with NeRF · ECCV 2022
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents · AAAI 2022
Robust Contrastive Learning Against Noisy Views · CVPR 2022
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning · ICCV 2021
Active Contrastive Learning of Audio-Visual Video Representations · ICLR 2021
Self-Supervised Learning of Compressed Video Representations · ICLR 2021
Contrastive Learning of Global and Local Video Representations · NIPS 2021
Parameter Efficient Multimodal Transformers for Video Representation Learning · ICLR 2021
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency · INTERSPEECH 2020
Image to Video Domain Adaptation Using Web Supervision · WACV 2020
Characterizing Bias in Classifiers using Generative Models · NIPS 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval · CVPR 2019
Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck · ICCV 2019
Neural TTS Stylization with Adversarial and Collaborative Games · ICLR 2019
Video Prediction with Appearance and Motion Conditions · ICML 2018
Improving Pairwise Ranking for Multi-Label Image Classification · CVPR 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering · CVPR 2017
Learning From Noisy Labels With Distillation · ICCV 2017
Balancing Appearance and Context in Sketch Interpretation · IJCAI 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description · CVPR 2016
Video2GIF: Automatic Generation of Animated GIFs From Video · CVPR 2016
Continuous Body and Hand Gesture Recognition for Natural Human-Computer Interaction: Extended Abstract · IJCAI 2015
TVSum: Summarizing Web Videos Using Titles · CVPR 2015
Video Co-Summarization: Video Summarization by Visual Co-Occurrence · CVPR 2015
One-Class Conditional Random Fields for Sequential Anomaly Detection · IJCAI 2013
Action Recognition by Hierarchical Sequence Summarization · CVPR 2013