Carl Doersch
23 papers · 2012–2025 · 8 top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (4)
🌍 Conference Polyglot (8)
🏃 Academic Marathon (13)
🐣 Hot Topic Early Bird
👥 Mega-Team (34)
🤝 Dynamic Duo (12)
🧬 Topic Evolution
💎 Century Club (23)
🗃️ Keyword Collector (96)
🚀 Conference Pioneer
Conferences
NIPS (8)
CVPR (5)
ICCV (4)
ICML (2)
CORL (1)
ECCV (1)
ICLR (1)
JMLR (1)
Keywords
video understanding (6)
self-supervised learning (4)
scene understanding (3)
point tracking (3)
motion estimation (3)
transfer learning (2)
representation learning (2)
synthetic data generation (2)
video tracking (2)
few-shot learning (2)
depth estimation (2)
domain adaptation (2)
3d vision (2)
optical flow (2)
online learning (1)
contrastive learning (1)
temporal modeling (1)
action recognition (1)
3d reconstruction (1)
transformer architecture (1)
Papers
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
CORL 2025
Direct Motion Models for Assessing Generated Videos
ICML 2025
TAPNext: Tracking Any Point (TAP) as Next Token Prediction
ICCV 2025
Learning from One Continuous Video Stream
CVPR 2024
TAPVid-3D: A Benchmark for Tracking Any Point in 3D
NIPS 2024
Moving Off-the-Grid: Scene-Grounded Video Representations
NIPS 2024
TAPIR: Tracking Any Point with Per-Frame Initialization and Temporal Refinement
ICCV 2023
Perception Test: A Diagnostic Benchmark for Multimodal Video Models
NIPS 2023
Kubric: A Scalable Dataset Generator
CVPR 2022
Perceiver IO: A General Architecture for Structured Inputs & Outputs
ICLR 2022
Input-Level Inductive Biases for 3D Reconstruction
CVPR 2022
TAP-Vid: A Benchmark for Tracking Any Point in a Video
NIPS 2022
CrossTransformers: spatially-aware few-shot transfer
NIPS 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
NIPS 2020
Structured agents for physical construction
ICML 2019
Sim2real transfer learning for 3D human pose estimation: motion to the rescue
NIPS 2019
Video Action Transformer Network
CVPR 2019
Exploiting Temporal Context for 3D Human Pose Estimation in the Wild
CVPR 2019
Learning Visual Question Answering by Bootstrapping Hard Attention
ECCV 2018
Multi-Task Self-Supervised Visual Learning
ICCV 2017
Unsupervised Visual Representation Learning by Context Prediction
ICCV 2015
Mid-level Visual Element Discovery as Discriminative Mode Seeking
NIPS 2013
Bounding the Probability of Error for High Precision Optical Character Recognition
JMLR 2012