Carl Vondrick
79 papers · 2011–2025 · 13 top CS/AI conferences
Achievements
Hot Topic Early Bird
Conference Polyglot (13)
Keyword Pioneer
Interdisciplinary Bridge
Academic Marathon (14)
Keyword Trendsetter Combo (6)
Conference Loyalist (24)
Dynamic Duo (15)
Mega-Team (56)
Triple Crown
Deep Specialist (11)
Keyword Champion
Grand Slam
The Questioner
Keyword Collector (277)
Century Club (79)
Conference Pioneer
Unstoppable (11)
Trend Setter
Prolific Year (12)
Conferences
CVPR (24)
ICCV (12)
ECCV (11)
NIPS (11)
ICLR (8)
CORL (3)
ICML (3)
NAACL (2)
AAAI (1)
ACL (1)
EMNLP (1)
MLHC (1)
UAI (1)
Keywords
representation learning (10)
self-supervised learning (9)
video understanding (8)
multimodal learning (6)
3D reconstruction (5)
zero-shot learning (4)
action recognition (4)
visual reasoning (3)
transfer learning (3)
convolutional neural network (3)
differentiable rendering (3)
out-of-distribution generalization (3)
pose estimation (3)
adversarial attack (3)
occlusion reasoning (3)
video prediction (3)
code generation (2)
adversarial robustness (2)
cross-modal learning (2)
object detection (2)
Papers
Generative Data Mining with Longtail-Guided Diffusion
ICML 2025
MINERVA: Evaluating Complex Video Reasoning
ICCV 2025
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
CVPR 2025
SelfIE: Self-Interpretation of Large Language Model Embeddings
ICML 2024
Raidar: geneRative AI Detection viA Rewriting
ICLR 2024
INViTE: INterpret and Control Vision-Language Models with Text Explanations
ICLR 2024
Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape
ICLR 2024
Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
ICLR 2024
MedAutoCorrect: Image-Conditioned Autocorrection in Medical Reporting
MLHC 2024
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
EMNLP 2024
Differentiable Robot Rendering
CORL 2024
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
CORL 2024
EraseDraw: Learning to Insert Objects by Erasing Them from Images
ECCV 2024
Discovering Unwritten Visual Classifiers with Large Language Models
ECCV 2024
Controlling the World by Sleight of Hand
ECCV 2024
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis
ECCV 2024
How Video Meetings Change Your Expression
ECCV 2024
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering
CVPR 2024
pix2gestalt: Amodal Segmentation by Synthesizing Wholes
CVPR 2024
Muscles in Action
ICCV 2023
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
NIPS 2023
Objaverse-XL: A Universe of 10M+ 3D Objects
NIPS 2023
Doubly Right Object Recognition: A Why Prompt for Visual Rationales
CVPR 2023
FLEX: Full-Body Grasping Without Full-Body Grasps
CVPR 2023
Tracking Through Containers and Occluders in the Wild
CVPR 2023
Humans As Light Bulbs: 3D Human Reconstruction From Thermal Reflection
CVPR 2023
What You Can Reconstruct From a Shadow
CVPR 2023
SHIFT3D: Synthesizing Hard Inputs For Tricking 3D Detectors
ICCV 2023
SurfsUP: Learning Fluid Simulation for Novel Surfaces
ICCV 2023
Zero-1-to-3: Zero-shot One Image to 3D Object
ICCV 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
ICCV 2023
Landscape Learning for Neural Network Inversion
ICCV 2023
Understanding Zero-shot Adversarial Robustness for Large-Scale Models
ICLR 2023
Visual Classification via Description from Large Language Models
ICLR 2023
Robust Perception through Equivariance
ICML 2023
Forget-me-not! Contrastive critics for mitigating posterior collapse
UAI 2022
Globetrotter: Connecting Languages by Connecting Images
CVPR 2022
There's a Time and Place for Reasoning Beyond the Image
ACL 2022
Causal Transportability for Visual Recognition
CVPR 2022
Revealing Occlusions With 4D Neural Fields
CVPR 2022
UnweaveNet: Unweaving Activity Stories
CVPR 2022
Real-Time Neural Voice Camouflage
ICLR 2022
Discrete Representations Strengthen Vision Transformer Robustness
ICLR 2022
Private Multiparty Perception for Navigation
NIPS 2022
RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios
NAACL 2022
Representing Spatial Trajectories as Distributions
NIPS 2022
It's Time for Artistic Correspondence in Music and Video
CVPR 2022
RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System
NAACL 2021
Dissecting Image Crops
ICCV 2021
Adversarial Attacks Are Reversible With Natural Supervision
ICCV 2021
Learning the Predictability of the Future
CVPR 2021
Generative Interventions for Causal Learning
CVPR 2021
Towards a Unifying Framework for Formal Theories of Novelty
AAAI 2021
The Boombox: Visual Reconstruction from Acoustic Vibrations
CORL 2021
Learning Goals From Failure
CVPR 2021
Listening to Sounds of Silence for Speech Denoising
NIPS 2020
Multitask Learning Strengthens Adversarial Robustness
ECCV 2020
We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
ECCV 2020
Learning to Learn Words from Visual Scenes
ECCV 2020
Oops! Predicting Unintentional Action in Video
CVPR 2020
VideoBERT: A Joint Model for Video and Language Representation Learning
ICCV 2019
Relational Action Forecasting
CVPR 2019
Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding
CVPR 2019
Metric Learning for Adversarial Robustness
NIPS 2019
AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions
CVPR 2018
The Sound of Pixels
ECCV 2018
Actor-centric Relation Network
ECCV 2018
Tracking Emerges by Colorizing Videos
ECCV 2018
Generating the Future With Adversarial Transformers
CVPR 2017
Following Gaze in Video
ICCV 2017
Generating Videos with Scene Dynamics
NIPS 2016
SoundNet: Learning Sound Representations from Unlabeled Video
NIPS 2016
Anticipating Visual Representations From Unlabeled Video
CVPR 2016
Learning Aligned Cross-Modal Representations From Weakly Aligned Data
CVPR 2016
Predicting Motivations of Actions by Leveraging Text
CVPR 2016
Learning visual biases from human imagination
NIPS 2015
Where are they looking?
NIPS 2015
HOGgles: Visualizing Object Detection Features
ICCV 2013
Video Annotation and Tracking with Active Learning
NIPS 2011