Kristen Grauman
125 papers · 2006–2026 · 9 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Interdisciplinary Bridge
Taxonomy Completionist (16)
Conference Polyglot (9)
Academic Marathon (20)
Renaissance Researcher (12)
Conference Loyalist (23)
Keyword Trendsetter Combo (17)
Dynamic Duo (18)
Topic Pioneer
Topic Evolution
Keyword Champion
Mega-Team (100)
Deep Specialist (26)
Trend Setter
Unstoppable (17)
The Questioner (2)
Conference Pioneer
Prolific Year (13)
Century Club (125)
Keyword Collector (60)
Conferences
CVPR (58)
NIPS (23)
ICCV (20)
ECCV (15)
ICML (3)
ICLR (2)
WACV (2)
CORL (1)
JMLR (1)
Research topics
Keywords
video understanding (22)
egocentric video (14)
egocentric vision (10)
audio-visual learning (8)
multimodal learning (8)
convolutional neural network (7)
active learning (7)
representation learning (7)
zero-shot learning (6)
reinforcement learning (6)
relative attribute (6)
action recognition (6)
transfer learning (6)
self-supervised learning (6)
metric learning (6)
object recognition (6)
activity recognition (5)
image retrieval (4)
contrastive learning (4)
weakly supervised learning (4)
Papers
SPOC: Spatially-Progressing Object State Change Segmentation in Video
WACV 2026
Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning
CVPR 2025
Progress-Aware Video Frame Captioning
CVPR 2025
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
CVPR 2025
FIction: 4D Future Interaction Prediction from Video
CVPR 2025
ExpertAF: Expert Actionable Feedback from Video
CVPR 2025
Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos
ICCV 2025
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
CVPR 2024
Detours for Navigating Instructional Videos
CVPR 2024
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
CVPR 2024
Learning Object State Changes in Videos: An Open-World Perspective
CVPR 2024
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
ECCV 2024
Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos
ECCV 2024
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
NIPS 2024
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation
ECCV 2024
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
CVPR 2024
Self-Supervised Visual Acoustic Matching
NIPS 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
NIPS 2023
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment
NIPS 2023
EgoEnv: Human-centric environment representations from egocentric video
NIPS 2023
Video-Mined Task Graphs for Keystep Recognition in Instructional Videos
NIPS 2023
EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset
NIPS 2023
HierVL: Learning Hierarchical Video-Language Embeddings
CVPR 2023
Egocentric Video Task Translation
CVPR 2023
SpotEM: Efficient Video Search for Episodic Memory
ICML 2023
Chat2Map: Efficient Scene Mapping From Multi-Ego Conversations
CVPR 2023
Novel-View Acoustic Synthesis
CVPR 2023
NaQ: Leveraging Narrations As Queries To Supervise Episodic Memory
CVPR 2023
Single-Stage Visual Query Localization in Egocentric Videos
NIPS 2023
Ego4D: Around the World in 3,000 Hours of Egocentric Video
CVPR 2022
Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation
CVPR 2022
PONI: Potential Functions for ObjectGoal Navigation With Interaction-Free Learning
CVPR 2022
Environment Predictive Coding for Visual Navigation
ICLR 2022
Active Audio-Visual Separation of Dynamic Sound Sources
ECCV 2022
Egocentric Activity Recognition and Localization on a 3D Map
ECCV 2022
Few-Shot Audio-Visual Learning of Environment Acoustics
NIPS 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
NIPS 2022
Discovering Underground Maps From Fashion
WACV 2022
Visual Acoustic Matching
CVPR 2022
VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency
CVPR 2021
Shaping embodied agent behavior with activity-context priors from egocentric video
NIPS 2021
DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video
CORL 2021
Ego-Exo: Transferring Visual Representations From Third-Person to First-Person Videos
CVPR 2021
Semantic Audio-Visual Navigation
CVPR 2021
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
CVPR 2021
From Culture to Clothing: Discovering the World Events Behind a Century of Fashion Images
ICCV 2021
Move2Hear: Active Audio-Visual Source Separation
ICCV 2021
Multiview Pseudo-Labeling for Semi-Supervised Learning From Video
ICCV 2021
Audio-Visual Floorplan Reconstruction
ICCV 2021
Anticipative Video Transformer
ICCV 2021
Learning to Set Waypoints for Audio-Visual Navigation
ICLR 2021
Listen to Look: Action Recognition by Previewing Audio
CVPR 2020
Proposal-based Video Completion
ECCV 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation
ECCV 2020
Occupancy Anticipation for Efficient Exploration and Navigation
ECCV 2020
Learning Affordance Landscapes for Interaction Exploration in 3D Environments
NIPS 2020
From Paris to Berlin: Discovering Fashion Style Influences Around the World
CVPR 2020
ViBE: Dressing for Diverse Body Shapes
CVPR 2020
SoundSpaces: Audio-Visual Navigation in 3D Environments
ECCV 2020
Ego-Topo: Environment Affordances From Egocentric Video
CVPR 2020
You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions
CVPR 2020
Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias
CVPR 2020
Kernel Transformer Networks for Compact Spherical Convolution
CVPR 2019
Grounded Human-Object Interaction Hotspots From Video
ICCV 2019
Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion
CVPR 2019
Less Is More: Learning Highlight Detection From Video Duration
CVPR 2019
Thinking Outside the Pool: Active Training Image Creation for Relative Attributes
CVPR 2019
2.5D Visual Sound
CVPR 2019
Co-Separating Sounds of Visual Objects
ICCV 2019
Fashion++: Minimal Edits for Outfit Improvement
ICCV 2019
SpotTune: Transfer Learning Through Adaptive Fine-Tuning
CVPR 2019
Retrospective Encoders for Video Summarization
ECCV 2018
Compare and Contrast: Learning Prominent Visual Differences
CVPR 2018
Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
CVPR 2018
BlockDrop: Dynamic Inference Paths in Residual Networks
CVPR 2018
Learning Compressible 360° Video Isomers
CVPR 2018
Creating Capsule Wardrobes From Fashion Images
CVPR 2018
Im2Flow: Motion Hallucination From Static Images for Action Recognition
CVPR 2018
VizWiz Grand Challenge: Answering Visual Questions From Blind People
CVPR 2018
Learning to Separate Object Sounds by Watching Unlabeled Video
ECCV 2018
Attributes as Operators: Factorizing Unseen Attribute-Object Compositions
ECCV 2018
Sidekick Policy Learning for Active Visual Exploration
ECCV 2018
Snap Angle Prediction for 360° Panoramas
ECCV 2018
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
ECCV 2018
Fashion Forward: Forecasting Visual Style in Fashion
ICCV 2017
On-Demand Learning for Deep Image Restoration
ICCV 2017
Learning the Latent "Look": Unsupervised Discovery of a Style-Coherent Embedding From Fashion Images
ICCV 2017
Learning Spherical Convolution for Fast Features from 360° Imagery
NIPS 2017
FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Videos
CVPR 2017
Seeing Invisible Poses: Estimating 3D Body Pose From Egocentric Video
CVPR 2017
Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly
CVPR 2017
Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images
ICCV 2017
Making 360° Video Watchable in 2D: Learning Videography for Click Free Viewing
CVPR 2017
Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video
CVPR 2016
Active Image Segmentation Propagation
CVPR 2016
Summary Transfer: Exemplar-Based Subset Selection for Video Summarization
CVPR 2016
Pull the Plug? Predicting If Computers or Humans Should Segment Images
CVPR 2016
Just Noticeable Differences in Visual Attributes
ICCV 2015
Learning Image Representations Tied to Ego-Motion
ICCV 2015
Diverse Sequential Subset Selection for Supervised Video Summarization
NIPS 2014
Predicting Useful Neighborhoods for Lazy Local Learning
NIPS 2014
Zero-shot recognition with unreliable attributes
NIPS 2014
Inferring Unseen Views of People
CVPR 2014
Decorrelating Semantic Visual Attributes by Resisting the Urge to Share
CVPR 2014
Beyond Comparing Image Pairs: Setwise Active Learning for Relative Attributes
CVPR 2014
Fine-Grained Visual Comparisons with Local Learning
CVPR 2014
Inferring Analogous Attributes
CVPR 2014
Analogy-preserving Semantic Embedding for Visual Object Categorization
ICML 2013
Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
ICCV 2013
Implied Feedback: Learning Nuances of User Behavior in Image Search
ICCV 2013
Attribute Pivots for Guiding Relevance Feedback in Image Search
ICCV 2013
Active Learning of an Action Detector from Untrimmed Videos
ICCV 2013
Attribute Adaptation for Personalized Image Search
ICCV 2013
Reshaping Visual Datasets for Domain Adaptation
NIPS 2013
Deformable Spatial Pyramid Matching for Fast Dense Correspondences
CVPR 2013
Story-Driven Summarization for Egocentric Video
CVPR 2013
Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots
CVPR 2013
Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation
ICML 2013
Semantic Kernel Forests from Multiple Taxonomies
NIPS 2012
Learning a Tree of Metrics with Disjoint Visual Features
NIPS 2011
Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning
NIPS 2010
Online Metric Learning and Fast Similarity Search
NIPS 2008
Multi-Level Active Prediction of Useful Image Annotations for Recognition
NIPS 2008
The Pyramid Match Kernel: Efficient Learning with Sets of Features
JMLR 2007
Approximate Correspondences in High Dimensions
NIPS 2006