conftrace_

Sergey Levine

362 papers · 2010–2026 · 14 conferences · across top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (44) 🌈 Renaissance Researcher (7) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (15) 🏠 Conference Loyalist (82) 👑 Domain Dominant (52) 🧬 Topic Evolution 🔬 Deep Specialist (20) 🌱 Topic Pioneer 🏆 Keyword Champion (8) 🤝 Dynamic Duo (90) 👑 Triple Crown 👥 Mega-Team (98) 🏆 Grand Slam ⚡ Prolific Year (46) ❓ The Questioner (9) 🔥 Unstoppable (13) 🗃️ Keyword Collector (213) 🚀 Conference Pioneer 💎 Century Club (361) 📈 Trend Setter

Conferences

ICLR (90) NIPS (82) ICML (75) CORL (67) RSS (30) CVPR (4) NAACL (4) ICCV (3) L4DC (2) AAAI (1) AISTATS (1) ECCV (1) IJCAI (1) JMLR (1)

Top co-authors

Chelsea Finn (90) Aviral Kumar (45) Pieter Abbeel (36) Karol Hausman (29) Abhishek Gupta (27) Benjamin Eysenbach (27) Ted Xiao (18) Quan Vuong (17) Karl Pertsch (17) Yevgen Chebotar (16)

Research topics

Reinforcement Learning (4) Applications (1) Robotics (1) Systems (1)

Keywords

reinforcement learning (54) offline reinforcement learning (30) deep reinforcement learning (26) imitation learning (26) representation learning (24) robotic manipulation (22) model-based reinforcement learning (16) sample efficiency (16) multi-task learning (12) distribution shift (12) policy learning (12) continuous control (11) off-policy learning (11) self-supervised learning (10) policy optimization (10) robot manipulation (10) goal-conditioned policy (10) few-shot learning (9) reward function (9) value function (9)

Papers

Cliqueformer: Model-Based Optimization with Structured Transformers AAAI 2026
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents ICML 2025
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration ICML 2025
Behavioral Exploration: Learning to Explore via In-Context Adaptation ICML 2025
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design ICML 2025
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models ICML 2025
Scaling Test-Time Compute Without Verification or RL is Suboptimal ICML 2025
Value-Based Deep RL Scales Predictably ICML 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation CORL 2025
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World CORL 2025
Training Strategies for Efficient Embodied Reasoning CORL 2025
RoboArena: Distributed Real-World Evaluation of Generalist Robot Policies CORL 2025
Steering Your Diffusion Policy with Latent Space Reinforcement Learning CORL 2025
$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization CORL 2025
Flow Q-Learning ICML 2025
What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning? ICML 2025
Adding Conditional Control to Diffusion Models with Reinforcement Learning ICLR 2025
Prioritized Generative Replay ICLR 2025
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents ICLR 2025
OGBench: Benchmarking Offline Goal-Conditioned RL ICLR 2025
One Step Diffusion via Shortcut Models ICLR 2025
Language Guided Skill Discovery ICLR 2025
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design ICLR 2025
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning ICLR 2025
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning RSS 2025
FAST: Efficient Action Tokenization for Vision-Language-Action Models RSS 2025
π₀: A Vision-Language-Action Flow Model for General Robot Control RSS 2025
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data ICLR 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training ICML 2025
Unfamiliar Finetuning Examples Control How Language Models Hallucinate NAACL 2025
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models ICML 2025
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance CORL 2024
Autonomous Improvement of Instruction Following Skills via Foundation Models CORL 2024
Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs CORL 2024
Evaluating Real-World Robot Manipulation Policies in Simulation CORL 2024
Robotic Control via Embodied Chain-of-Thought Reasoning CORL 2024
OpenVLA: An Open-Source Vision-Language-Action Model CORL 2024
Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation CORL 2024
Lifelong Autonomous Improvement of Navigation Foundation Models in the Wild CORL 2024
LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Video CORL 2024
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation CORL 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset RSS 2024
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation RSS 2024
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning NIPS 2024
Octo: An Open-Source Generalist Robot Policy RSS 2024
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference NIPS 2024
Learning to Assist Humans without Inferring Rewards NIPS 2024
Is Value Learning Really the Main Bottleneck in Offline RL? NIPS 2024
Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization NIPS 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning NIPS 2024
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models NIPS 2024
SELFI: Autonomous Self-Improvement with RL for Vision-Based Navigation around People CORL 2024
RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes RSS 2024
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting RSS 2024
Yell At Your Robot: Improving On-the-Fly from Language Corrections RSS 2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL ICML 2024
Learning to Explore in POMDPs with Informational Rewards ICML 2024
Feedback Efficient Online Fine-Tuning of Diffusion Models ICML 2024
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models ICML 2024
Foundation Policies with Hilbert Representations ICML 2024
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs ICML 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making ICML 2024
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator ICML 2024
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings ICML 2024
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL ICML 2024
Project and Probe: Sample-Efficient Adaptation by Interpolating Orthogonal Features ICLR 2024
Deep Neural Networks Tend To Extrapolate Predictably ICLR 2024
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction ICLR 2024
The False Promise of Imitating Proprietary Language Models ICLR 2024
RLIF: Interactive Imitation Learning as Reinforcement Learning ICLR 2024
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data ICLR 2024
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity ICLR 2024
Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models ICLR 2024
Training Diffusion Models with Reinforcement Learning ICLR 2024
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization AISTATS 2024
Offline RL for Natural Language Generation with Implicit Language Q Learning ICLR 2023
ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints NIPS 2023
HIQL: Offline Goal-Conditioned RL with Latent States as Actions NIPS 2023
Learning to Influence Human Behavior with Offline Reinforcement Learning NIPS 2023
Ignorance is Bliss: Robust Control via Information Gating NIPS 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents NIPS 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning NIPS 2023
Accelerating Exploration with Unlabeled Prior Data NIPS 2023
ViNT: A Foundation Model for Visual Navigation CORL 2023
Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning CORL 2023
BridgeData V2: A Dataset for Robot Learning at Scale CORL 2023
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation CORL 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CORL 2023
Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning CORL 2023
FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing CORL 2023
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control CORL 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions CORL 2023
Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement ICLR 2023
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts ICLR 2023
Efficient Deep Reinforcement Learning Requires Regulating Overfitting ICLR 2023
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective ICLR 2023
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes ICLR 2023
Confidence-Conditioned Value Functions for Offline Reinforcement Learning ICLR 2023
Efficient Online Reinforcement Learning with Offline Data ICML 2023
PaLM-E: An Embodied Multimodal Language Model ICML 2023
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning ICML 2023
Reinforcement Learning from Passive Data via Latent Intentions ICML 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum ICML 2023
Predictable MDP Abstraction for Unsupervised Model-Based RL ICML 2023
Jump-Start Reinforcement Learning ICML 2023
Adversarial Policies Beat Superhuman Go AIs ICML 2023
Contrastive Example-Based Control L4DC 2023
Multi-Task Imitation Learning for Linear Dynamical Systems L4DC 2023
Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware RSS 2023
Pre-Training for Robots: Offline RL Enables Learning New Tasks in a Handful of Trials RSS 2023
Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators RSS 2023
RT-1: Robotics Transformer for Real-World Control at Scale RSS 2023
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models RSS 2023
Learning and Adapting Agile Locomotion Skills by Transferring Experience RSS 2023
Robust and Versatile Bipedal Jumping Control through Reinforcement Learning RSS 2023
Demonstrating A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning RSS 2023
Maximum Entropy RL (Provably) Solves Some Robust RL Problems ICLR 2022
MEMO: Test Time Robustness via Adaptation and Augmentation NIPS 2022
ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints RSS 2022
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets RSS 2022
Contrastive Learning as Goal-Conditioned Reinforcement Learning NIPS 2022
Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation NIPS 2022
First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization NIPS 2022
Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks CORL 2022
Distributionally Adaptive Meta Reinforcement Learning NIPS 2022
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL NIPS 2022
Adversarial Unlearning: Reducing Confidence Along Adversarial Directions NIPS 2022
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity NIPS 2022
You Only Live Once: Single-Life Reinforcement Learning NIPS 2022
Data-Driven Offline Decision-Making via Invariant Representation Learning NIPS 2022
Is Anyone There? Learning a Planner Contingent on Perceptual Uncertainty CORL 2022
Don’t Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning CORL 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models CORL 2022
GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots CORL 2022
Context-Aware Language Modeling for Goal-Oriented Dialogue Systems NAACL 2022
Offline Reinforcement Learning for Visual Navigation CORL 2022
Imitating Past Successes can be Very Suboptimal NIPS 2022
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data ICLR 2022
Autonomous Reinforcement Learning: Formalism and Benchmarking ICLR 2022
Information Prioritization through Empowerment in Visual Model-based RL ICLR 2022
RvS: What is Essential for Offline RL via Supervised Learning? ICLR 2022
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks ICLR 2022
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning NAACL 2022
How to Leverage Unlabeled Data in Offline Reinforcement Learning ICML 2022
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization ICML 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances CORL 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action CORL 2022
Offline Meta-Reinforcement Learning with Online Self-Supervision ICML 2022
Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control ICML 2022
Should I Run Offline Reinforcement Learning or Behavioral Cloning? ICLR 2022
Planning with Diffusion for Flexible Behavior Synthesis ICML 2022
Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning ICML 2022
Offline RL Policies Should Be Trained to be Adaptive ICML 2022
Extending the WILDS Benchmark for Unsupervised Adaptation ICLR 2022
The Information Geometry of Unsupervised Reinforcement Learning ICLR 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization ICLR 2022
Offline Reinforcement Learning with Implicit Q-Learning ICLR 2022
CoMPS: Continual Meta Policy Search ICLR 2022
DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning NIPS 2022
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning ICLR 2022
Data-Driven Offline Optimization for Architecting Hardware Accelerators ICLR 2022
Learning Invariant Representations for Reinforcement Learning without Reconstruction ICLR 2021
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments ICLR 2021
Learning to Reach Goals via Iterated Supervised Learning ICLR 2021
Recurrent Independent Mechanisms ICLR 2021
Model-Based Visual Planning with Self-Supervised Functional Distances ICLR 2021
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning ICLR 2021
Autonomous Reinforcement Learning via Subgoal Curricula NIPS 2021
Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation CORL 2021
Adaptive Risk Minimization: Learning to Adapt to Domain Shift NIPS 2021
COMBO: Conservative Offline Model-Based Policy Optimization NIPS 2021
Robust Predictable Control NIPS 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making NIPS 2021
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? NIPS 2021
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability NIPS 2021
Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation ICML 2021
Conservative Objective Models for Effective Offline Model-Based Optimization ICML 2021
Model-Based Reinforcement Learning via Latent-Space Collocation ICML 2021
Simple and Effective VAE Training with Calibrated Decoders ICML 2021
Emergent Social Learning via Multi-agent Reinforcement Learning ICML 2021
Offline Meta-Reinforcement Learning with Advantage Weighting ICML 2021
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning ICML 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts ICML 2021
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning ICML 2021
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning ICML 2021
Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning ICML 2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills ICML 2021
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment ICML 2021
Bayesian Adaptation for Covariate Shift NIPS 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem NIPS 2021
Information is Power: Intrinsic Control via Information Capture NIPS 2021
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning NIPS 2021
Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification NIPS 2021
Outcome-Driven Reinforcement Learning via Variational Inference NIPS 2021
Understanding the World Through Action CORL 2021
Hierarchically Integrated Models: Learning to Navigate from Heterogeneous Robots CORL 2021
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale CORL 2021
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning CORL 2021
Rapid Exploration for Open-World Navigation with Latent Goal Models CORL 2021
Scaling Up Multi-Task Robotic Reinforcement Learning CORL 2021
A Workflow for Offline Model-Free Robotic Reinforcement Learning CORL 2021
Benchmarks for Deep Off-Policy Evaluation ICLR 2021
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments ICLR 2021
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation ICLR 2021
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback ICLR 2021
C-Learning: Learning to Achieve Goals via Recursive Classification ICLR 2021
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers ICLR 2021
Conservative Safety Critics for Exploration ICLR 2021
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning ICLR 2021
Evolving Reinforcement Learning Algorithms ICLR 2021
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning ICLR 2021
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions ICML 2020
RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real CVPR 2020
Learning Predictive Models from Observation and Interaction ECCV 2020
The Ingredients of Real World Robotic Reinforcement Learning ICLR 2020
Model Based Reinforcement Learning for Atari ICLR 2020
Thinking While Moving: Deep Reinforcement Learning with Concurrent Control ICLR 2020
Deep Imitative Models for Flexible Inference, Planning, and Control ICLR 2020
Dynamics-Aware Unsupervised Discovery of Skills ICLR 2020
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery ICLR 2020
Meta-Learning without Memorization ICLR 2020
VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation ICLR 2020
Adversarial Policies: Attacking Deep Reinforcement Learning ICLR 2020
Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards ICLR 2020
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives ICLR 2020
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget ICLR 2020
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards ICLR 2020
AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos RSS 2020
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning RSS 2020
Learning Agile Robotic Locomotion Skills by Imitating Animals RSS 2020
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction NIPS 2020
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors NIPS 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement NIPS 2020
MOPO: Model-based Offline Policy Optimization NIPS 2020
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design NIPS 2020
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL NIPS 2020
Gradient Surgery for Multi-Task Learning NIPS 2020
Model Inversion Networks for Model-Based Optimization NIPS 2020
Continual Learning of Control Primitives : Skill Discovery via Reset-Games NIPS 2020
Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction NIPS 2020
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? ICML 2020
Skew-Fit: State-Covering Self-Supervised Reinforcement Learning ICML 2020
Learning Human Objectives by Evaluating Hypothetical Behavior ICML 2020
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings ICML 2020
Conservative Q-Learning for Offline Reinforcement Learning NIPS 2020
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model NIPS 2020
Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting CORL 2020
Reinforcement Learning with Videos: Combining Offline Observations with Interaction CORL 2020
Assisted Perception: Optimizing Observations to Communicate State CORL 2020
Learning to Walk in the Real World with Minimal Human Effort CORL 2020
MELD: Meta-Reinforcement Learning from Images via Latent State Models CORL 2020
Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning CORL 2020
Chaining Behaviors from Data with Model-Free Reinforcement Learning CORL 2020
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning ICLR 2019
Near-Optimal Representation Learning for Hierarchical Reinforcement Learning ICLR 2019
End-To-End Robotic Reinforcement Learning without Reward Engineering RSS 2019
Contextual Imagined Goals for Self-Supervised Robotic Learning CORL 2019
Learning to Walk Via Deep Reinforcement Learning RSS 2019
Wasserstein Dependency Measure for Representation Learning NIPS 2019
Unsupervised Curricula for Visual Meta-Reinforcement Learning NIPS 2019
Guided Meta-Policy Search NIPS 2019
Planning with Goal-Conditioned Policies NIPS 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction NIPS 2019
Off-Policy Evaluation via Off-Policy Classification NIPS 2019
MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies NIPS 2019
Causal Confusion in Imitation Learning NIPS 2019
When to Trust Your Model: Model-Based Policy Optimization NIPS 2019
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning NIPS 2019
Meta-Learning with Implicit Gradients NIPS 2019
Compositional Plan Vectors NIPS 2019
Improvisation through Physical Understanding: Using Novel Objects As Tools with Visual Foresight RSS 2019
Learning Latent Plans from Play CORL 2019
Deep Dynamics Models for Learning Dexterous Manipulation CORL 2019
Entity Abstraction in Visual Model-Based Reinforcement Learning CORL 2019
Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow ICLR 2019
Learning Actionable Representations with Goal Conditioned Policies ICLR 2019
Time-Agnostic Prediction: Predicting Predictable Video Frames ICLR 2019
Recall Traces: Backtracking Models for Efficient Reinforcement Learning ICLR 2019
Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL ICLR 2019
Automatically Composing Representation Transformations as a Means for Generalization ICLR 2019
Guiding Policies with Language via Meta-Learning ICLR 2019
Reasoning About Physical Interactions with Object-Oriented Prediction and Planning ICLR 2019
ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots CORL 2019
Online Meta-Learning ICML 2019
Diagnosing Bottlenecks in Deep Q-learning Algorithms ICML 2019
EMI: Exploration with Mutual Information ICML 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables ICML 2019
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning ICML 2019
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning ICML 2019
PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings ICCV 2019
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning CORL 2019
Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning CORL 2019
Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks CVPR 2019
From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following ICLR 2019
RoboNet: Large-Scale Multi-Robot Learning CORL 2019
InfoBot: Transfer and Exploration via the Information Bottleneck ICLR 2019
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning ICLR 2019
Unsupervised Learning via Meta-Learning ICLR 2019
Diversity is All You Need: Learning Skills without a Reward Function ICLR 2019
Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation CORL 2018
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings ICML 2018
Latent Space Policies for Hierarchical Reinforcement Learning ICML 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor ICML 2018
Regret Minimization for Partially Observable Deep Reinforcement Learning ICML 2018
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control ICML 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning ICML 2018
One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning RSS 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models NIPS 2018
Divide-and-Conquer Reinforcement Learning ICLR 2018
Temporal Difference Models: Model-Free Deep RL for Model-Based Control ICLR 2018
Learning Robust Rewards with Adverserial Inverse Reinforcement Learning ICLR 2018
Recasting Gradient-Based Meta-Learning as Hierarchical Bayes ICLR 2018
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning ICLR 2018
Stochastic Variational Video Prediction ICLR 2018
Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm ICLR 2018
Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control CVPR 2018
Data-Efficient Hierarchical Reinforcement Learning NIPS 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition NIPS 2018
Probabilistic Model-Agnostic Meta-Learning NIPS 2018
Visual Reinforcement Learning with Imagined Goals NIPS 2018
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior NIPS 2018
Visual Memory for Robust Path Following NIPS 2018
Meta-Reinforcement Learning of Structured Exploration Strategies NIPS 2018
Learning with Latent Language NAACL 2018
Few-Shot Goal Inference for Visuomotor Learning and Planning CORL 2018
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping CORL 2018
Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation CORL 2018
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning CORL 2018
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations RSS 2018
Shared Autonomy via Deep Reinforcement Learning RSS 2018
The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes? CORL 2017
Self-Supervised Visual Planning with Temporal Skip Connections CORL 2017
One-Shot Visual Imitation Learning via Meta-Learning CORL 2017
Modular Multitask Reinforcement Learning with Policy Sketches ICML 2017
End-to-End Learning of Semantic Grasping CORL 2017
GPLAC: Generalizing Vision-Based Robotic Skills Using Weakly Labeled Images ICCV 2017
CAD2RL: Real Single-Image Flight Without a Single Real Image RSS 2017
Unsupervised Perceptual Rewards for Imitation Learning RSS 2017
Reinforcement Learning with Deep Energy-Based Policies ICML 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks ICML 2017
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning ICML 2017
Value Iteration Networks IJCAI 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning NIPS 2017
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning NIPS 2017
Cognitive Mapping and Planning for Visual Navigation CVPR 2017
Learning Robotic Manipulation of Granular Media CORL 2017
Learning to Poke by Poking: Experiential Learning of Intuitive Physics NIPS 2016
Unsupervised Learning for Physical Interaction through Video Prediction NIPS 2016
Continuous Deep Q-Learning with Model-based Acceleration ICML 2016
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization ICML 2016
End-to-End Training of Deep Visuomotor Policies JMLR 2016
Backprop KF: Learning Discriminative Deterministic State Estimators NIPS 2016
Guided Policy Search via Approximate Mirror Descent NIPS 2016
Value Iteration Networks NIPS 2016
Recurrent Network Models for Human Dynamics ICCV 2015
Trust Region Policy Optimization ICML 2015
Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics NIPS 2014
Learning Complex Neural Network Policies with Trajectory Optimization ICML 2014
Guided Policy Search ICML 2013
Variational Policy Search via Trajectory Optimization NIPS 2013
Nonlinear Inverse Reinforcement Learning with Gaussian Processes NIPS 2011
Feature Construction for Inverse Reinforcement Learning NIPS 2010