Sanja Fidler

166 papers · 2009–2025 · 10 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌍 Conference Polyglot (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (16)

🌟 Keyword Trendsetter Combo (19) 🏠 Conference Loyalist (24) 🤝 Dynamic Duo (39) 🌱 Topic Pioneer 👑 Triple Crown 🔬 Deep Specialist (38) 🏆 Keyword Champion (2) ❓ The Questioner (4) 💎 Century Club (166) 🗃️ Keyword Collector (608) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (14) ⚡ Prolific Year (14)

Conferences

CVPR (67) ICCV (35) NIPS (24) ICLR (19) ECCV (12) ICML (4) EMNLP (2) JMLR (1) SEMEVAL (1) UAI (1)

Top co-authors

Raquel Urtasun (39) Antonio Torralba (24) Jun Gao (22) Huan Ling (20) Karsten Kreis (20) David Acuna (20) Amlan Kar (17) Seung Wook Kim (16) Zan Gojcic (13) Or Litany (13)

Keywords

semantic segmentation (26) object detection (16) 3d reconstruction (14) autonomous driving (11) generative model (10) markov random field (9) convolutional neural network (8) generative adversarial network (8) diffusion model (8) scene understanding (7) 3d object detection (7) 3d vision (7) differentiable rendering (6) semi-supervised learning (6) instance segmentation (6) video understanding (6) depth estimation (5) object segmentation (5) multimodal learning (5) neural rendering (5)

Papers

Socratic-MCTS: Test-Time Visual Reasoning by Asking the Right Questions EMNLP 2025
PartField: Learning 3D Feature Fields for Part Segmentation and Beyond ICCV 2025
Controllable Weather Synthesis and Removal with Video Diffusion Models ICCV 2025
Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models CVPR 2025
DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models CVPR 2025
Optimizing Data Collection for Machine Learning JMLR 2025
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models ICCV 2025
Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves? CVPR 2025
ReMatching Dynamic Reconstruction Flow ICLR 2025
OmniRe: Omni Urban Scene Reconstruction ICLR 2025
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control CVPR 2025
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies CVPR 2024
EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision ICLR 2024
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models ICLR 2024
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models ICML 2024
Transferring Labels to Solve Annotation Mismatches Across Object Detection Datasets ICLR 2024
Trajeglish: Traffic Modeling as Next-Token Prediction ICLR 2024
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space ICLR 2024
L4GM: Large 4D Gaussian Reconstruction Model NIPS 2024
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features NIPS 2024
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats NIPS 2024
Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models EMNLP 2024
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis ECCV 2024
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering ECCV 2024
NeRF-XL: NeRF at Any Scale with Multi-GPU ECCV 2024
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features CVPR 2024
Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata CVPR 2024
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models CVPR 2024
Neural LiDAR Fields for Novel View Synthesis ICCV 2023
ATT3D: Amortized Text-to-3D Object Synthesis ICCV 2023
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models ICCV 2023
Magic3D: High-Resolution Text-to-3D Content Creation CVPR 2023
Learning Human Dynamics in Autonomous Driving Scenarios ICCV 2023
Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models CVPR 2023
Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes CVPR 2023
NeuralField-LDM: Scene Generation With Hierarchical Latent Diffusion Models CVPR 2023
Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion CVPR 2023
VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion CVPR 2023
Neural Kernel Surface Reconstruction CVPR 2023
Towards Viewpoint Robustness in Bird's Eye View Segmentation ICCV 2023
DreamTeacher: Pretraining Image Backbones with Deep Generative Models ICCV 2023
End-to-end 3D Tracking with Decoupled Queries ICCV 2023
LION: Latent Point Diffusion Models for 3D Shape Generation NIPS 2022
Neural Fields As Learnable Kernels for 3D Reconstruction CVPR 2022
Extracting Triangular 3D Models, Materials, and Lighting From Images CVPR 2022
AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis CVPR 2022
MvDeCor: Multi-View Dense Correspondence Learning for Fine-Grained 3D Segmentation ECCV 2022
Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion ECCV 2022
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior CVPR 2022
Frame Averaging for Equivariant Shape Space Learning CVPR 2022
BigDatasetGAN: Synthesizing ImageNet With Pixel-Wise Annotations CVPR 2022
Polymorphic-GAN: Generating Aligned Samples Across Multiple Domains With Learned Morph Maps CVPR 2022
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations NIPS 2022
Optimizing Data Collection for Machine Learning NIPS 2022
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images NIPS 2022
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks CVPR 2022
Domain Adversarial Training: A Game Perspective ICLR 2022
Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach ICLR 2022
NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation UAI 2021
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation NIPS 2021
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis NIPS 2021
Scalable Neural Data Server: A Data Recommender for Transfer Learning NIPS 2021
ATISS: Autoregressive Transformers for Indoor Scene Synthesis NIPS 2021
Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence NIPS 2021
EditGAN: High-Precision Semantic Image Editing NIPS 2021
DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer NIPS 2021
DriveGAN: Towards a Controllable High-Quality Neural Simulation CVPR 2021
DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort CVPR 2021
Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization CVPR 2021
Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets CVPR 2021
Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes CVPR 2021
Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks CVPR 2021
Learning Indoor Inverse Rendering With 3D Spatially-Varying Lighting ICCV 2021
Physics-Based Human Motion Estimation and Synthesis From Videos ICCV 2021
3DStyleNet: Creating 3D Shapes With Geometric and Texture Style Variations ICCV 2021
Emergent Road Rules In Multi-Agent Driving Environments ICLR 2021
gradSim: Differentiable simulation for system identification and visuomotor control ICLR 2021
Personalized Federated Learning with First Order Model Optimization ICLR 2021
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering ICLR 2021
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration ICLR 2021
f-Domain Adversarial Learning: Theory and Algorithms ICML 2021
Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection ICML 2021
Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation ECCV 2020
Learning to Simulate Dynamic Environments With GameGAN CVPR 2020
Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data CVPR 2020
Auto-Tuning Structured Light by Optical Stochastic Gradient Descent CVPR 2020
Learning to Evaluate Perception Models Using Planner-Centric Metrics CVPR 2020
Learning Deformable Tetrahedral Meshes for 3D Reconstruction NIPS 2020
Variational Amodal Object Completion NIPS 2020
Efficient and Information-Preserving Future Frame Prediction and Beyond ICLR 2020
A Theoretical Analysis of the Number of Shots in Few-Shot Learning ICLR 2020
Interactive Annotation of 3D Object Geometry using 2D Scribbles ECCV 2020
Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid ECCV 2020
Expressive Telepresence via Modular Codec Avatars ECCV 2020
ScribbleBox: Interactive Annotation Framework for Video Object Segmentation ECCV 2020
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D ECCV 2020
Fast Interactive Object Annotation With Curve-GCN CVPR 2019
DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation ICCV 2019
Neural Turtle Graphics for Modeling City Road Layouts ICCV 2019
Meta-Sim: Learning to Generate Synthetic Datasets ICCV 2019
Video Face Clustering With Unknown Number of Clusters ICCV 2019
Gated-SCNN: Gated Shape CNNs for Semantic Segmentation ICCV 2019
Learning to Caption Images Through a Lifetime by Asking Questions ICCV 2019
Neural Graph Evolution: Towards Efficient Automatic Robot Design ICLR 2019
Visual Reasoning by Progressive Module Networks ICLR 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis ICML 2019
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer NIPS 2019
Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations CVPR 2019
Action Recognition From Single Timestamp Supervision in Untrimmed Videos CVPR 2019
Object Instance Annotation With Deep Extreme Level Set Evolution CVPR 2019
DARNet: Deep Active Ray Network for Building Segmentation CVPR 2019
Synthesizing Environment-Aware Activities via Activity Sketches CVPR 2019
Creative Flow+ Dataset CVPR 2019
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset ECCV 2018
NerveNet: Learning Structured Policy with Graph Neural Networks ICLR 2018
A Face-to-Face Neural Conversation Model CVPR 2018
Now You Shake Me: Towards Automatic 4D Cinema CVPR 2018
Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++ CVPR 2018
Learning to Act Properly: Predicting and Explaining Affordances From Images CVPR 2018
SurfConv: Bridging 3D and 2D Convolution for RGBD Images CVPR 2018
MovieGraphs: Towards Understanding Human-Centric Situations From Videos CVPR 2018
A Neural Compositional Paradigm for Image Captioning NIPS 2018
VirtualHome: Simulating Household Activities via Programs CVPR 2018
SGN: Sequential Grouping Networks for Instance Segmentation ICCV 2017
Situation Recognition With Graph Neural Networks ICCV 2017
3D Graph Neural Networks for RGBD Semantic Segmentation ICCV 2017
Sports Field Localization via Deep Structured Models CVPR 2017
Teaching Machines to Describe Images with Natural Language Feedback NIPS 2017
Scene Parsing Through ADE20K Dataset CVPR 2017
Be Your Own Prada: Fashion Synthesis With Structural Coherence ICCV 2017
Open Vocabulary Scene Parsing ICCV 2017
Towards Diverse and Natural Image Descriptions via a Conditional GAN ICCV 2017
TorontoCity: Seeing the World With a Million Eyes ICCV 2017
Annotating Object Instances With a Polygon-RNN CVPR 2017
Monocular 3D Object Detection for Autonomous Driving CVPR 2016
MovieQA: Understanding Stories in Movies Through Question-Answering CVPR 2016
Proximal Deep Structured Models NIPS 2016
Instance-Level Segmentation for Autonomous Driving With Deep Densely Connected MRFs CVPR 2016
HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images CVPR 2016
Lost Shopping! Monocular Localization in Large Indoor Spaces ICCV 2015
Skip-Thought Vectors NIPS 2015
Holistic 3D Scene Understanding From a Single Geo-Tagged Image CVPR 2015
Rent3D: Floor-Plan Priors for Monocular Layout Estimation CVPR 2015
Real-Time Coarse-to-Fine Topologically Preserving Segmentation CVPR 2015
Neuroaesthetics in Fashion: Modeling the Perception of Fashionability CVPR 2015
3D Object Proposals for Accurate Object Class Detection NIPS 2015
segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection CVPR 2015
Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books ICCV 2015
Learning to Combine Mid-Level Cues for Object Proposal Generation ICCV 2015
Enhancing Road Maps by Parsing Aerial Images Around the World ICCV 2015
Monocular Object Instance Segmentation and Depth Ordering With CNNs ICCV 2015
Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions ICCV 2015
Visual Semantic Search: Retrieving Videos via Complex Textual Queries CVPR 2014
Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision CVPR 2014
What are You Talking About? Text-to-Image Coreference CVPR 2014
The Role of Context for Object Detection and Semantic Segmentation in the Wild CVPR 2014
Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts CVPR 2014
Detecting Curved Symmetric Parts Using a Deformable Disc Model ICCV 2013
Holistic Scene Understanding for 3D Object Detection with RGBD Cameras ICCV 2013
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs CVPR 2013
Box in the Box: Joint 3D Layout and Object Reasoning from Single Images ICCV 2013
Bottom-Up Segmentation for Top-Down Detection CVPR 2013
A Sentence Is Worth a Thousand Pixels CVPR 2013
Unsupervised Disambiguation of Image Captions SEMEVAL 2012
3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model NIPS 2012
Evaluating multi-class learning strategies in a generative hierarchical framework for object detection NIPS 2009