Alan Yuille

144 papers · 2013–2026 · 17 top CS/AI conferences

Achievements

🏃 Academic Marathon (13) 🌍 Conference Polyglot (17) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (10) 🌟 Keyword Trendsetter Combo (4) 🏠 Conference Loyalist (23) 🤝 Dynamic Duo (29) 👥 Mega-Team (25) 🔬 Deep Specialist (17) 👑 Triple Crown 🏆 Keyword Champion (3) 🏆 Grand Slam ❓ The Questioner (3) 💎 Century Club (144) 🔥 Unstoppable (7) 🗃️ Keyword Collector (399) 🚀 Conference Pioneer ⚡ Prolific Year (16)

Conferences

CVPR (38) ECCV (27) ICCV (23) ICLR (18) WACV (12) AAAI (6) ICML (5) NIPS (3) IJCAI (2) MICCAI (2) MIDL (2) EMNLP (1) EACL (1) CORL (1) JMLR (1) NAACL (1) RSS (1)

Top co-authors

Adam Kortylewski (29) Cihang Xie (23) Angtian Wang (19) Zongwei Zhou (15) Wufei Ma (14) Siyuan Qiao (13) Huiyu Wang (13) Wei Shen (12) Qihang Yu (12) Jieru Mei (11)

Research topics

Keywords

object detection (15) semantic segmentation (12) 3d reconstruction (10) contrastive learning (8) medical imaging (7) instance segmentation (7) diffusion model (6) pose estimation (6) 3d vision (6) image classification (6) transfer learning (6) visual question answering (5) vision transformer (5) vision-language model (5) self-supervised learning (4) representation learning (4) convolutional neural network (4) autonomous driving (4) video understanding (4) generative model (4)

Papers

4D-Animal: Freely Reconstructing Animatable 3D Animals from Videos WACV 2026
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering ICLR 2025
GenEx: Generating an Explorable World ICLR 2025
Autoregressive Pretraining with Mamba in Vision ICLR 2025
EasyRet3D: Uncalibrated Multi-View Multi-Human 3D Reconstruction and Tracking WACV 2025
PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation RSS 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More ICML 2025
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching ICML 2025
Medical World Model ICCV 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation ICCV 2025
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark ICCV 2025
VideoAuteur: Towards Long Narrative Video Generation ICCV 2025
RadGPT: Constructing 3D Image-Text Tumor Datasets ICCV 2025
Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data ICCV 2025
Mamba-Reg: Vision Mamba Also Needs Registers CVPR 2025
Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution CVPR 2025
Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency CVPR 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models CVPR 2025
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models CVPR 2025
Scaling 3D Compositional Models for Robust Classification and Pose Estimation ICCV 2025
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction ICCV 2025
Structure-Aware Sparse-View X-ray 3D Reconstruction CVPR 2024
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data CVPR 2024
Towards Generalizable Tumor Synthesis CVPR 2024
A Bayesian Approach to OOD Robustness in Image Classification CVPR 2024
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models CVPR 2024
De-Diffusion Makes Text a Strong Cross-Modal Interface CVPR 2024
Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models EACL 2024
Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis ECCV 2024
NOVUM: Neural Object Volumes for Robust Object Classification ECCV 2024
Efficient Large Multi-modal Models via Visual Context Compression NIPS 2024
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding NIPS 2024
Neural Textured Deformable Meshes for Robust Analysis-by-Synthesis WACV 2024
Robust Category-Level 3D Pose Estimation From Diffusion-Enhanced Synthetic Data WACV 2024
From Pixel to Cancer: Cellular Automata in Computed Tomography MICCAI 2024
HISR: Hybrid Implicit Surface Representation for Photorealistic 3D Human Reconstruction AAAI 2024
Embracing Massive Medical Data MICCAI 2024
Rejuvenating image-GPT as Strong Visual Representation Learners ICML 2024
How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? ICLR 2024
Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search ICLR 2024
Generating Images with 3D Annotations Using Diffusion Models ICLR 2024
Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation ICLR 2024
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting NIPS 2024
iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning ECCV 2024
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation ECCV 2024
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers ECCV 2024
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties ECCV 2024
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ECCV 2024
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data ECCV 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era CVPR 2024
CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans ICCV 2023
3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation ICCV 2023
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection ICCV 2023
Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape ICCV 2023
Diffusion Models as Masked Autoencoders ICCV 2023
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-Training ICCV 2023
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models ICLR 2023
Which Layer is Learning Faster? A Systematic Exploration of Layer-wise Convergence Rate for Deep Neural Networks ICLR 2023
VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis ICLR 2023
Making Your First Choice: To Address Cold Start Problem in Medical Active Learning MIDL 2023
CoKe: Contrastive Learning for Robust Keypoint Detection WACV 2023
CORL: Compositional Representation Learning for Few-Shot Classification WACV 2023
Delving Into Masked Autoencoders for Multi-Label Thorax Disease Classification WACV 2023
Learning From Temporal Gradient for Semi-Supervised Action Recognition CVPR 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation CVPR 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training CVPR 2022
A Simple Data Mixing Prior for Improving Self-Supervised Learning CVPR 2022
Context-Enhanced Stereo Transformer ECCV 2022
In Defense of Image Pre-training for Spatiotemporal Recognition ECCV 2022
Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features ECCV 2022
TransMix: Attend To Mix for Vision Transformers CVPR 2022
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection CVPR 2022
Lite Vision Transformer With Enhanced Self-Attention CVPR 2022
Simulated Adversarial Testing of Face Recognition Models CVPR 2022
Point-Level Region Contrast for Object Detection Pre-Training CVPR 2022
Learning Part Segmentation Through Unsupervised Domain Adaptation From Synthetic Vehicles CVPR 2022
OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images ECCV 2022
PartImageNet: A Large, High-Quality Dataset of Parts ECCV 2022
Explicit Occlusion Reasoning for Multi-Person 3D Human Pose Estimation ECCV 2022
Unsupervised Domain Adaptation through Shape Modeling for Medical Image Segmentation MIDL 2022
In Defense of Online Models for Video Instance Segmentation ECCV 2022
Learning Road Scene-level Representations via Semantic Region Prediction CORL 2022
Image BERT Pre-training with Online Tokenizer ICLR 2022
k-Means Mask Transformer ECCV 2022
CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation ECCV 2022
Coarse-to-Fine Incremental Few-Shot Learning ECCV 2022
Fast AdvProp ICLR 2022
SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering CVPR 2022
Amodal Segmentation Through Out-of-Task and Out-of-Distribution Generalization With a Bayesian Model CVPR 2022
Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement CVPR 2021
Exploring Simple 3D Multi-Object Tracking for Autonomous Driving ICCV 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images ICCV 2021
A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation ICCV 2021
Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency CVPR 2021
MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers CVPR 2021
Deeply Shape-Guided Cascade for Instance Segmentation CVPR 2021
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation ICLR 2021
CO2: Consistent Contrast for Unsupervised Visual Representation Learning ICLR 2021
Shape-Texture Debiased Neural Network Training ICLR 2021
Robust Instance Segmentation Through Reasoning About Multi-Object Occlusion CVPR 2021
DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution CVPR 2021
CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning CVPR 2021
VIP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation CVPR 2021
Self-Supervised Pillar Motion Learning for Autonomous Driving CVPR 2021
Mask Guided Matting via Progressive Refinement Network CVPR 2021
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks AAAI 2021
DASZL: Dynamic Action Signatures for Zero-shot Learning AAAI 2021
When Radiology Report Generation Meets Knowledge Graph AAAI 2020
Resisting Large Data Variations via Introspective Transformation Network WACV 2020
Robust Face Detection via Learning Small Faces on Hard Images WACV 2020
Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion WACV 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation ECCV 2020
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots ECCV 2020
Learning Transferable Adversarial Examples via Ghost Networks AAAI 2020
Intriguing Properties of Adversarial Training at Scale ICLR 2020
AtomNAS: Fine-Grained End-to-End Neural Architecture Search ICLR 2020
PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning ECCV 2020
3D semi-supervised learning with uncertainty-aware multi-view co-training WACV 2020
Identifying Model Weakness with Adversarial Examiner AAAI 2020
JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans ECCV 2020
Adversarial Examples for Edge Detection: They Exist, and they Transfer WACV 2020
Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses ECCV 2020
Are Labels Necessary for Neural Architecture Search? ECCV 2020
Scene Graph Parsing as Dependency Parsing NAACL 2018
Progressive Neural Architecture Search ECCV 2018
Weakly Supervised Region Proposal Network and Object Detection ECCV 2018
Mitigating Adversarial Effects Through Randomization ICLR 2018
Deep Co-Training for Semi-Supervised Image Recognition ECCV 2018
Gradually Updated Neural Networks for Large-Scale Image Recognition ICML 2018
PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution EMNLP 2018
Multi-Stage Multi-Recursive-Input Fully Convolutional Networks for Neuronal Boundary Detection ICCV 2017
ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond ICCV 2017
Genetic CNN ICCV 2017
SORT: Second-Order Response Transform for Visual Recognition ICCV 2017
Recurrent Multimodal Interaction for Referring Image Segmentation ICCV 2017
Object Recognition with and without Objects IJCAI 2017
MAT: A Multimodal Attentive Translator for Image Captioning IJCAI 2017
Adversarial Examples for Semantic Segmentation and Object Detection ICCV 2017
Complexity of Representation and Inference in Compositional Models with Part Sharing JMLR 2016
Learning Deep Structured Models ICML 2015
The Role of Context for Object Detection and Semantic Segmentation in the Wild CVPR 2014
Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts CVPR 2014
Bottom-Up Segmentation for Top-Down Detection CVPR 2013
Boundary Detection Benchmarking: Beyond F-Measures CVPR 2013