James M. Rehg

66 papers · 2011–2025 · 10 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🏃 Academic Marathon (14)

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (114) 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (36) 🔬 Deep Specialist (13) 👥 Mega-Team (100) 🏆 Keyword Champion (3) 🗃️ Keyword Collector (321) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (66) 🔥 Unstoppable (13) ⚡ Prolific Year (7)

Conferences

CVPR (36) ECCV (8) NIPS (8) ICCV (6) CORL (2) ICML (2) AACL (1) IJCNLP (1) JMLR (1) MLHC (1)

Top co-authors

Yin Li (9) Stefan Stojanov (8) Zixuan Huang (8) Miao Liu (8) Fuxin Li (7) Anh Thai (7) Fiona Ryan (7) Ahmad Humayun (6) Le Song (6) Weiyang Liu (5)

Keywords

multimodal learning (8) egocentric video (6) diffusion model (6) few-shot learning (5) object detection (4) self-supervised learning (4) transformer architecture (4) video understanding (4) object recognition (4) action recognition (4) pose estimation (4) zero-shot learning (4) autonomous driving (3) low-shot learning (3) hidden markov model (3) appearance model (3) feature extraction (3) object tracking (3) convolutional neural network (3) image segmentation (2)

Papers

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images CVPR 2025
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation CVPR 2025
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation CVPR 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models CVPR 2025
Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders CVPR 2025
SocialGesture: Delving into Multi-person Gesture Understanding CVPR 2025
Improving Personalized Search with Regularized Low-Rank Parameter Updates CVPR 2025
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives CVPR 2024
Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations CVPR 2024
ZeroShape: Regression-based Zero-shot Shape Reconstruction CVPR 2024
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models CVPR 2024
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective CVPR 2024
PointInfinity: Resolution-Invariant Point Diffusion Models CVPR 2024
LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs CVPR 2024
Egocentric Auditory Attention Localization in Conversations CVPR 2023
ShapeClipper: Scalable 3D Shape Learning From Single-View Images via Geometric and CLIP-Based Consistency CVPR 2023
Low-shot Object Learning with Mutual Exclusivity Bias NIPS 2023
Transformer-based Localization from Embodied Dialog with Large-scale Pre-training AACL 2022
Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization NIPS 2022
Kernel Multimodal Continuous Attention NIPS 2022
PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation NIPS 2022
Transformer-based Localization from Embodied Dialog with Large-scale Pre-training IJCNLP 2022
Generative Adversarial Network for Future Hand Segmentation from Egocentric Video ECCV 2022
Egocentric Activity Recognition and Localization on a 3D Map ECCV 2022
Planes vs. Chairs: Category-Guided 3D Shape Learning without Any 3D Cues ECCV 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video CVPR 2022
Using Shape To Categorize: Low-Shot Learning With an Explicit Shape Bias CVPR 2021
Orthogonal Over-Parameterized Training CVPR 2021
No RL, No Simulation: Learning to Navigate without Navigating NIPS 2021
Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking CVPR 2021
Detecting Attended Visual Targets in Video CVPR 2020
A Robust Functional EM Algorithm for Incomplete Panel Count Data NIPS 2020
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video ECCV 2020
Regularizing Neural Networks via Minimizing Hyperspherical Energy CVPR 2020
Learning to Generate Synthetic Data via Compositing CVPR 2019
Locally Weighted Regression Pseudo-Rehearsal for Adaptive Model Predictive Control CORL 2019
Neural Similarity Learning NIPS 2019
A Spatiotemporal Approach to Predicting Glaucoma Progression Using a CT-HMM MLHC 2019
Incremental Object Learning From Contiguous Views CVPR 2019
Unsupervised 3D Pose Estimation With Geometric Self-Supervision CVPR 2019
Taking a Deeper Look at the Inverse Compositional Algorithm CVPR 2019
Multi-object Tracking with Neural Gating Using Bilinear LSTM ECCV 2018
Decoupled Networks CVPR 2018
3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare CVPR 2018
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation ECCV 2018
Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency ECCV 2018
In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video ECCV 2018
iSurvive: An Interpretable, Event-time Prediction Model for mHealth ICML 2017
Aggressive Deep Driving: Combining Convolutional Neural Networks and Model Predictive Control CORL 2017
Iterative Machine Teaching ICML 2017
Unsupervised Learning of Edges CVPR 2016
Multiple Hypothesis Tracking Revisited ICCV 2015
Robust Video Segment Proposals With Painless Occlusion Handling CVPR 2015
Delving Into Egocentric Actions CVPR 2015
Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression NIPS 2015
The Middle Child Problem: Revisiting Parametric Min-Cut and Seeds for Object Proposals ICCV 2015
Minimizing Human Effort in Interactive Tracking by Incremental Learning of Model Parameters ICCV 2015
Gaze-Enabled Egocentric Video Summarization via Constrained Submodular Maximization CVPR 2015
The Secrets of Salient Object Segmentation CVPR 2014
RIGOR: Reusing Inference in Graph Cuts for Generating Object Regions CVPR 2014
Modeling Actions through State Changes CVPR 2013
GOSUS: Grassmannian Online Subspace Updates with Structured-Sparsity ICCV 2013
Learning to Predict Gaze in Egocentric Video ICCV 2013
Video Segmentation by Tracking Many Figure-Ground Segments ICCV 2013
Efficient and Effective Visual Codebook Generation Using Additive Kernels JMLR 2011