Yilun Du

82 papers · 2018–2026 · 10 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🏃 Academic Marathon (7)

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (22) 🤝 Dynamic Duo (23) 👑 Triple Crown 👥 Mega-Team (34) 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion (2) ❓ The Questioner 🔥 Unstoppable (8) 🗃️ Keyword Collector (225) 💎 Century Club (80) ⚡ Prolific Year (22)

Conferences

ICLR (22) ICML (18) NIPS (18) ICCV (6) CORL (5) RSS (5) CVPR (4) ECCV (2) AAAI (1) ACL (1)

Top co-authors

Joshua B. Tenenbaum (23) Chuang Gan (15) Shuang Li (13) Antonio Torralba (10) Igor Mordatch (10) Josh Tenenbaum (9) Vincent Sitzmann (7) Joshua Tenenbaum (6) Leslie Pack Kaelbling (6) Hongxin Zhang (5)

Keywords

diffusion model (11) energy-based model (7) generative model (5) large language model (4) representation learning (4) robot manipulation (3) neural field (3) pose estimation (3) video generation (3) image generation (3) point cloud (3) 3d reconstruction (2) concept learning (2) inverse dynamics (2) scene representation (2) embodied ai (2) markov chain monte carlo (2) semantic segmentation (2) robotic manipulation (2) self-supervised learning (2)

Papers

Towards Generalist Robot Learning from Internet Video: A Survey (Abstract Reprint) AAAI 2026
Large Language Models Are Bad Dice Players: LLMs Struggle to Generate Random Numbers from Statistical Distributions ACL 2026
Solving New Tasks by Adapting Internet Video Knowledge ICLR 2025
Compositional Scene Understanding through Inverse Generative Modeling ICML 2025
History-Guided Video Diffusion ICML 2025
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains ICLR 2025
Grounding Video Models to Actions through Goal Conditioned Exploration ICLR 2025
AdaWorld: Learning Adaptable World Models with Latent Actions ICML 2025
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning CVPR 2025
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation ICLR 2025
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ICLR 2025
Looped Transformers for Length Generalization ICLR 2025
Learning 4D Embodied World Models ICCV 2025
Learning Interactive Real-World Simulators ICLR 2024
Compositional Generative Inverse Design ICLR 2024
Position: Compositional Generative Modeling: A Single Model is Not All You Need ICML 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion NIPS 2024
Few-Shot Task Learning through Inverse Generative Modeling NIPS 2024
3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024
Position: Video as the New Language for Real-World Decision Making ICML 2024
Compositional Image Decomposition with Diffusion Models ICML 2024
Potential Based Diffusion Motion Planning ICML 2024
RoboDreamer: Learning Compositional World Models for Robot Imagination ICML 2024
Learning Iterative Reasoning through Energy Diffusion ICML 2024
Learning to Act from Actionless Videos through Dense Correspondences ICLR 2024
Set It Up!: Functional Object Arrangement with Compositional Generative Models RSS 2024
PoCo: Policy Composition from and for Heterogeneous Robot Learning RSS 2024
Improving Factuality and Reasoning in Language Models through Multiagent Debate ICML 2024
Large-scale Reinforcement Learning for Diffusion Models ECCV 2024
Training Diffusion Models with Reinforcement Learning ICLR 2024
Building Cooperative Embodied Agents Modularly with Large Language Models ICLR 2024
Video Language Planning ICLR 2024
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments ICLR 2024
Probabilistic Adaptation of Black-Box Text-to-Video Models ICLR 2024
Learning to Jointly Understand Visual and Tactile Signals ICLR 2024
Is Conditional Generative Modeling all you need for Decision Making? ICLR 2023
FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow NIPS 2023
Learning Universal Policies via Text-Guided Video Generation NIPS 2023
3D-LLM: Injecting the 3D World into Large Language Models NIPS 2023
Compositional Foundation Models for Hierarchical Planning NIPS 2023
Adaptive Online Replanning with Diffusion Models NIPS 2023
DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models NIPS 2023
Secure Out-of-Distribution Task Generalization with Energy-Based Models NIPS 2023
Compositional Diffusion-Based Continuous Constraint Solvers CORL 2023
Learning To Render Novel Views From Wide-Baseline Stereo Pairs CVPR 2023
3D Concept Learning and Reasoning From Multi-View Images CVPR 2023
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models ICCV 2023
Planning with Sequence Models through Iterative Energy Minimization ICLR 2023
Composing Ensembles of Pre-trained Models via Iterative Consensus ICLR 2023
Neural Groundplans: Persistent Neural Scene Representations from a Single Image ICLR 2023
Inferring Relational Potentials in Interacting Systems ICML 2023
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC ICML 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion RSS 2023
StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects RSS 2023
NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial Understanding with Objects RSS 2023
Compositional Visual Generation with Composable Diffusion Models ECCV 2022
Kubric: A Scalable Dataset Generator CVPR 2022
MIRA: Mental Imagery for Robotic Affordances CORL 2022
SE(3)-Equivariant Relational Rearrangement with Neural Descriptor Fields CORL 2022
Pre-Trained Language Models for Interactive Decision-Making NIPS 2022
Planning with Diffusion for Flexible Behavior Synthesis ICML 2022
Streaming Inference for Infinite Feature Models ICML 2022
3D Concept Grounding on Neural Fields NIPS 2022
Learning Neural Acoustic Fields NIPS 2022
Learning Iterative Reasoning through Energy Minimization ICML 2022
Curious Representation Learning for Embodied Intelligence ICCV 2021
3D Shape Generation and Completion Through Point-Voxel Diffusion ICCV 2021
Improved Contrastive Divergence Training of Energy-Based Models ICML 2021
Unsupervised Learning of Compositional Energy Concepts NIPS 2021
Neural Radiance Flow for 4D View Synthesis and Video Processing ICCV 2021
Learning to Compose Visual Relations NIPS 2021
Learning Signal-Agnostic Manifolds of Neural Fields NIPS 2021
Unsupervised Discovery of 3D Physical Objects from Video ICLR 2021
Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions ICCV 2021
Observational Overfitting in Reinforcement Learning ICLR 2020
A Long Horizon Planning Framework for Manipulating Rigid Pointcloud Objects CORL 2020
Compositional Visual Generation with Energy Based Models NIPS 2020
Energy-based models for atomic-resolution protein conformations ICLR 2020
Task-Agnostic Dynamics Priors for Deep Reinforcement Learning ICML 2019
Model-Based Planning with Energy-Based Models CORL 2019
Implicit Generation and Modeling with Energy Based Models NIPS 2019
Learning to Exploit Stability for 3D Scene Parsing NIPS 2018