Sangho Lee
16 papers · 2017–2025 · 8 top CS/AI conferences
Achievements
Academic Marathon (8)
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot (8)
Cross-Pollinator (13)
Taxonomy Completionist (37)
Mega-Team (50)
The Questioner
Keyword Collector (79)
Century Club (16)
Conferences
CVPR (5)
AAAI (3)
ICCV (2)
ICLR (2)
AISTATS (1)
ECCV (1)
EMNLP (1)
ICML (1)
Keywords
multimodal learning (3)
image generation (2)
mutual information (2)
self-supervised learning (2)
benchmark evaluation (1)
image captioning (1)
transformer architecture (1)
pose estimation (1)
contrastive learning (1)
visual question answering (1)
video captioning (1)
online learning (1)
audio-visual learning (1)
self-attention mechanism (1)
representation learning (1)
depth estimation (1)
image synthesis (1)
instruction following (1)
efficient computing (1)
unsupervised learning (1)
Papers
One Diffusion to Generate Them All
CVPR 2025
MAMS: Model-Agnostic Module Selection Framework for Video Captioning
AAAI 2025
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
CVPR 2025
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
ECCV 2024
Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens
AAAI 2024
Towards a Complete Benchmark on Video Moment Localization
AISTATS 2024
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action
CVPR 2024
Can Language Models Laugh at YouTube Short-form Videos?
EMNLP 2023
Unsupervised Representation Learning via Neural Activation Coding
ICML 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
ICCV 2021
Parameter Efficient Multimodal Transformers for Video Representation Learning
ICLR 2021
Self-Supervised Learning of Compressed Video Representations
ICLR 2021
URNet: User-Resizable Residual Networks with Conditional Gating Module
AAAI 2020
A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos
CVPR 2018
A Read-Write Memory Network for Movie Story Understanding
ICCV 2017