Michael S. Ryoo

38 papers · 2013–2025 · 9 top CS/AI conferences

Achievements


🌈 Renaissance Researcher (7) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (12) 🌍 Conference Polyglot (9) 🗺️ Taxonomy Completionist (56)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏆 Keyword Champion (6) 🧬 Topic Evolution 🤝 Dynamic Duo (14) 👥 Mega-Team (55) 🚀 Conference Pioneer 💎 Century Club (38) 🗃️ Keyword Collector (141) 🔥 Unstoppable (9) 📈 Trend Setter ❓ The Questioner ⚡ Prolific Year (5)

Conferences

CVPR (15) ECCV (8) CORL (3) ICCV (3) WACV (3) AAAI (2) NIPS (2) ICLR (1) IJCAI (1)

Top co-authors

AJ Piergiovanni (14) Anelia Angelova (10) Kumara Kahatapitiya (9) Xiang Li (3) Jinghuan Shang (3) Srijan Das (3) Yong Jae Lee (3) Kanchana Ranasinghe (3) Ted Xiao (2) David J. Crandall (2)

Research topics

Computer Vision (1)

Keywords

video understanding (8) activity recognition (6) self-supervised learning (5) representation learning (4) action recognition (4) video representation (3) temporal activity detection (3) convolutional neural network (3) contrastive learning (3) vision transformer (3) action detection (2) multi-scale feature (2) transfer learning (2) temporal structure (2) viewpoint invariance (2) multimodal learning (2) attention mechanism (2) model architecture (2) autoregressive model (2) robot action policy (2)

Papers

Adaptive Caching for Faster Video Generation with Diffusion Transformers ICCV 2025
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders WACV 2024
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs CVPR 2024
Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities CVPR 2024
MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying CVPR 2024
VicTR: Video-conditioned Text Representations for Activity Recognition CVPR 2024
Grafting Vision Transformers WACV 2024
Language-based Action Concept Spaces Improve Video Self-Supervised Learning NIPS 2023
Weakly-Guided Self-Supervised Pretraining for Temporal Activity Detection AAAI 2023
Token Turing Machines CVPR 2023
SWAT: Spatial Structure Within and Among Tokens IJCAI 2023
ViewCLR: Learning Self-Supervised Video Representation for Unseen Viewpoints WACV 2023
Active Vision Reinforcement Learning under Limited Visual Observability NIPS 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CORL 2023
Video Question Answering with Iterative Video-Text Co-Tokenization ECCV 2022
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection CVPR 2022
Self-Supervised Video Transformer CVPR 2022
TRITON: Neural Neural Textures for Better Sim2Real CORL 2022
StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning ECCV 2022
Recognizing Actions in Videos From Unseen Viewpoints CVPR 2021
4D-Net for Learned Multi-Modal Alignment ICCV 2021
Coarse-Fine Networks for Temporal Activity Detection in Videos CVPR 2021
Evolving Losses for Unsupervised Video Representation Learning CVPR 2020
Differentiable Grammars for Videos AAAI 2020
Adversarial Generative Grammars for Human Activity Prediction ECCV 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification ECCV 2020
Password-conditioned Anonymization and Deanonymization with Face Identity Transformers ECCV 2020
AssembleNet++: Assembling Modality Representations via Attention Connections - Supplementary Material - ECCV 2020
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures ICLR 2020
Model-based Behavioral Cloning with Future Image Similarity Learning CORL 2019
Representation Flow for Action Recognition CVPR 2019
Evolving Space-Time Neural Architectures for Videos ICCV 2019
Learning to Anonymize Faces for Privacy Preserving Action Detection ECCV 2018
Learning Latent Super-Events to Detect Multiple Activities in Videos CVPR 2018
Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos ECCV 2018
Identifying First-Person Camera Wearers in Third-Person Videos CVPR 2017
Pooled Motion Features for First-Person Videos CVPR 2015
First-Person Activity Recognition: What Are They Doing to Me? CVPR 2013