Sergey Levine
362 papers · 2010–2026 · 14 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (44)
Renaissance Researcher (7)
Interdisciplinary Bridge
Hot Topic Early Bird
Keyword Trendsetter Combo (15)
Conference Loyalist (82)
Domain Dominant (52)
Topic Evolution
Deep Specialist (20)
Topic Pioneer
Keyword Champion (8)
Dynamic Duo (90)
Triple Crown
Mega-Team (98)
Grand Slam
Prolific Year (46)
The Questioner (9)
Unstoppable (13)
Keyword Collector (213)
Conference Pioneer
Century Club (361)
Trend Setter
Conferences
ICLR (90)
NIPS (82)
ICML (75)
CORL (67)
RSS (30)
CVPR (4)
NAACL (4)
ICCV (3)
L4DC (2)
AAAI (1)
AISTATS (1)
ECCV (1)
IJCAI (1)
JMLR (1)
Research topics
Keywords
reinforcement learning (54)
offline reinforcement learning (30)
deep reinforcement learning (26)
imitation learning (26)
representation learning (24)
robotic manipulation (22)
model-based reinforcement learning (16)
sample efficiency (16)
multi-task learning (12)
distribution shift (12)
policy learning (12)
continuous control (11)
off-policy learning (11)
self-supervised learning (10)
policy optimization (10)
robot manipulation (10)
goal-conditioned policy (10)
few-shot learning (9)
reward function (9)
value function (9)
Papers
Cliqueformer: Model-Based Optimization with Structured Transformers
AAAI 2026
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
ICML 2025
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
ICML 2025
Behavioral Exploration: Learning to Explore via In-Context Adaptation
ICML 2025
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
ICML 2025
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models
ICML 2025
Scaling Test-Time Compute Without Verification or RL is Suboptimal
ICML 2025
Value-Based Deep RL Scales Predictably
ICML 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
CORL 2025
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World
CORL 2025
Training Strategies for Efficient Embodied Reasoning
CORL 2025
RoboArena: Distributed Real-World Evaluation of Generalist Robot Policies
CORL 2025
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
CORL 2025
$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
CORL 2025
Flow Q-Learning
ICML 2025
What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning?
ICML 2025
Adding Conditional Control to Diffusion Models with Reinforcement Learning
ICLR 2025
Prioritized Generative Replay
ICLR 2025
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents
ICLR 2025
OGBench: Benchmarking Offline Goal-Conditioned RL
ICLR 2025
One Step Diffusion via Shortcut Models
ICLR 2025
Language Guided Skill Discovery
ICLR 2025
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
ICLR 2025
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning
RSS 2025
FAST: Efficient Action Tokenization for Vision-Language-Action Models
RSS 2025
π₀: A Vision-Language-Action Flow Model for General Robot Control
RSS 2025
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
ICLR 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
ICML 2025
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
NAACL 2025
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICML 2025
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
CORL 2024
Autonomous Improvement of Instruction Following Skills via Foundation Models
CORL 2024
Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs
CORL 2024
Evaluating Real-World Robot Manipulation Policies in Simulation
CORL 2024
Robotic Control via Embodied Chain-of-Thought Reasoning
CORL 2024
OpenVLA: An Open-Source Vision-Language-Action Model
CORL 2024
Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation
CORL 2024
Lifelong Autonomous Improvement of Navigation Foundation Models in the Wild
CORL 2024
LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Video
CORL 2024
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
CORL 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
RSS 2024
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation
RSS 2024
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
NIPS 2024
Octo: An Open-Source Generalist Robot Policy
RSS 2024
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
NIPS 2024
Learning to Assist Humans without Inferring Rewards
NIPS 2024
Is Value Learning Really the Main Bottleneck in Offline RL?
NIPS 2024
Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization
NIPS 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
NIPS 2024
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
NIPS 2024
SELFI: Autonomous Self-Improvement with RL for Vision-Based Navigation around People
CORL 2024
RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes
RSS 2024
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting
RSS 2024
Yell At Your Robot: Improving On-the-Fly from Language Corrections
RSS 2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
ICML 2024
Learning to Explore in POMDPs with Informational Rewards
ICML 2024
Feedback Efficient Online Fine-Tuning of Diffusion Models
ICML 2024
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
ICML 2024
Foundation Policies with Hilbert Representations
ICML 2024
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
ICML 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
ICML 2024
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
ICML 2024
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
ICML 2024
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
ICML 2024
Project and Probe: Sample-Efficient Adaptation by Interpolating Orthogonal Features
ICLR 2024
Deep Neural Networks Tend To Extrapolate Predictably
ICLR 2024
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction
ICLR 2024
The False Promise of Imitating Proprietary Language Models
ICLR 2024
RLIF: Interactive Imitation Learning as Reinforcement Learning
ICLR 2024
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data
ICLR 2024
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
ICLR 2024
Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models
ICLR 2024
Training Diffusion Models with Reinforcement Learning
ICLR 2024
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
AISTATS 2024
Offline RL for Natural Language Generation with Implicit Language Q Learning
ICLR 2023
ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints
NIPS 2023
HIQL: Offline Goal-Conditioned RL with Latent States as Actions
NIPS 2023
Learning to Influence Human Behavior with Offline Reinforcement Learning
NIPS 2023
Ignorance is Bliss: Robust Control via Information Gating
NIPS 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents
NIPS 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
NIPS 2023
Accelerating Exploration with Unlabeled Prior Data
NIPS 2023
ViNT: A Foundation Model for Visual Navigation
CORL 2023
Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning
CORL 2023
BridgeData V2: A Dataset for Robot Learning at Scale
CORL 2023
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation
CORL 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
CORL 2023
Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning
CORL 2023
FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing
CORL 2023
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control
CORL 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
CORL 2023
Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement
ICLR 2023
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
ICLR 2023
Efficient Deep Reinforcement Learning Requires Regulating Overfitting
ICLR 2023
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
ICLR 2023
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
ICLR 2023
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
ICLR 2023
Efficient Online Reinforcement Learning with Offline Data
ICML 2023
PaLM-E: An Embodied Multimodal Language Model
ICML 2023
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning
ICML 2023
Reinforcement Learning from Passive Data via Latent Intentions
ICML 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
ICML 2023
Predictable MDP Abstraction for Unsupervised Model-Based RL
ICML 2023
Jump-Start Reinforcement Learning
ICML 2023
Adversarial Policies Beat Superhuman Go AIs
ICML 2023
Contrastive Example-Based Control
L4DC 2023
Multi-Task Imitation Learning for Linear Dynamical Systems
L4DC 2023
Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
RSS 2023
Pre-Training for Robots: Offline RL Enables Learning New Tasks in a Handful of Trials
RSS 2023
Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators
RSS 2023
RT-1: Robotics Transformer for Real-World Control at Scale
RSS 2023
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
RSS 2023
Learning and Adapting Agile Locomotion Skills by Transferring Experience
RSS 2023
Robust and Versatile Bipedal Jumping Control through Reinforcement Learning
RSS 2023
Demonstrating A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning
RSS 2023
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
ICLR 2022
MEMO: Test Time Robustness via Adaptation and Augmentation
NIPS 2022
ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints
RSS 2022
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
RSS 2022
Contrastive Learning as Goal-Conditioned Reinforcement Learning
NIPS 2022
Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation
NIPS 2022
First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization
NIPS 2022
Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks
CORL 2022
Distributionally Adaptive Meta Reinforcement Learning
NIPS 2022
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL
NIPS 2022
Adversarial Unlearning: Reducing Confidence Along Adversarial Directions
NIPS 2022
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity
NIPS 2022
You Only Live Once: Single-Life Reinforcement Learning
NIPS 2022
Data-Driven Offline Decision-Making via Invariant Representation Learning
NIPS 2022
Is Anyone There? Learning a Planner Contingent on Perceptual Uncertainty
CORL 2022
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
CORL 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
CORL 2022
GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots
CORL 2022
Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
NAACL 2022
Offline Reinforcement Learning for Visual Navigation
CORL 2022
Imitating Past Successes can be Very Suboptimal
NIPS 2022
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data
ICLR 2022
Autonomous Reinforcement Learning: Formalism and Benchmarking
ICLR 2022
Information Prioritization through Empowerment in Visual Model-based RL
ICLR 2022
RvS: What is Essential for Offline RL via Supervised Learning?
ICLR 2022
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
ICLR 2022
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
NAACL 2022
How to Leverage Unlabeled Data in Offline Reinforcement Learning
ICML 2022
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization
ICML 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
CORL 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
CORL 2022
Offline Meta-Reinforcement Learning with Online Self-Supervision
ICML 2022
Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control
ICML 2022
Should I Run Offline Reinforcement Learning or Behavioral Cloning?
ICLR 2022
Planning with Diffusion for Flexible Behavior Synthesis
ICML 2022
Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
ICML 2022
Offline RL Policies Should Be Trained to be Adaptive
ICML 2022
Extending the WILDS Benchmark for Unsupervised Adaptation
ICLR 2022
The Information Geometry of Unsupervised Reinforcement Learning
ICLR 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
ICLR 2022
Offline Reinforcement Learning with Implicit Q-Learning
ICLR 2022
CoMPS: Continual Meta Policy Search
ICLR 2022
DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning
NIPS 2022
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning
ICLR 2022
Data-Driven Offline Optimization for Architecting Hardware Accelerators
ICLR 2022
Learning Invariant Representations for Reinforcement Learning without Reconstruction
ICLR 2021
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments
ICLR 2021
Learning to Reach Goals via Iterated Supervised Learning
ICLR 2021
Recurrent Independent Mechanisms
ICLR 2021
Model-Based Visual Planning with Self-Supervised Functional Distances
ICLR 2021
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
ICLR 2021
Autonomous Reinforcement Learning via Subgoal Curricula
NIPS 2021
Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation
CORL 2021
Adaptive Risk Minimization: Learning to Adapt to Domain Shift
NIPS 2021
COMBO: Conservative Offline Model-Based Policy Optimization
NIPS 2021
Robust Predictable Control
NIPS 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
NIPS 2021
Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
NIPS 2021
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
NIPS 2021
Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation
ICML 2021
Conservative Objective Models for Effective Offline Model-Based Optimization
ICML 2021
Model-Based Reinforcement Learning via Latent-Space Collocation
ICML 2021
Simple and Effective VAE Training with Calibrated Decoders
ICML 2021
Emergent Social Learning via Multi-agent Reinforcement Learning
ICML 2021
Offline Meta-Reinforcement Learning with Advantage Weighting
ICML 2021
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
ICML 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts
ICML 2021
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
ICML 2021
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
ICML 2021
Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
ICML 2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
ICML 2021
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment
ICML 2021
Bayesian Adaptation for Covariate Shift
NIPS 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
NIPS 2021
Information is Power: Intrinsic Control via Information Capture
NIPS 2021
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
NIPS 2021
Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification
NIPS 2021
Outcome-Driven Reinforcement Learning via Variational Inference
NIPS 2021
Understanding the World Through Action
CORL 2021
Hierarchically Integrated Models: Learning to Navigate from Heterogeneous Robots
CORL 2021
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale
CORL 2021
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
CORL 2021
Rapid Exploration for Open-World Navigation with Latent Goal Models
CORL 2021
Scaling Up Multi-Task Robotic Reinforcement Learning
CORL 2021
A Workflow for Offline Model-Free Robotic Reinforcement Learning
CORL 2021
Benchmarks for Deep Off-Policy Evaluation
ICLR 2021
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments
ICLR 2021
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation
ICLR 2021
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
ICLR 2021
C-Learning: Learning to Achieve Goals via Recursive Classification
ICLR 2021
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
ICLR 2021
Conservative Safety Critics for Exploration
ICLR 2021
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
ICLR 2021
Evolving Reinforcement Learning Algorithms
ICLR 2021
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
ICLR 2021
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
ICML 2020
RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real
CVPR 2020
Learning Predictive Models from Observation and Interaction
ECCV 2020
The Ingredients of Real World Robotic Reinforcement Learning
ICLR 2020
Model Based Reinforcement Learning for Atari
ICLR 2020
Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
ICLR 2020
Deep Imitative Models for Flexible Inference, Planning, and Control
ICLR 2020
Dynamics-Aware Unsupervised Discovery of Skills
ICLR 2020
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
ICLR 2020
Meta-Learning without Memorization
ICLR 2020
VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
ICLR 2020
Adversarial Policies: Attacking Deep Reinforcement Learning
ICLR 2020
Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards
ICLR 2020
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
ICLR 2020
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget
ICLR 2020
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
ICLR 2020
AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos
RSS 2020
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
RSS 2020
Learning Agile Robotic Locomotion Skills by Imitating Animals
RSS 2020
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
NIPS 2020
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors
NIPS 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
NIPS 2020
MOPO: Model-based Offline Policy Optimization
NIPS 2020
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
NIPS 2020
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL
NIPS 2020
Gradient Surgery for Multi-Task Learning
NIPS 2020
Model Inversion Networks for Model-Based Optimization
NIPS 2020
Continual Learning of Control Primitives: Skill Discovery via Reset-Games
NIPS 2020
Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction
NIPS 2020
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
ICML 2020
Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
ICML 2020
Learning Human Objectives by Evaluating Hypothetical Behavior
ICML 2020
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
ICML 2020
Conservative Q-Learning for Offline Reinforcement Learning
NIPS 2020
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
NIPS 2020
Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting
CORL 2020
Reinforcement Learning with Videos: Combining Offline Observations with Interaction
CORL 2020
Assisted Perception: Optimizing Observations to Communicate State
CORL 2020
Learning to Walk in the Real World with Minimal Human Effort
CORL 2020
MELD: Meta-Reinforcement Learning from Images via Latent State Models
CORL 2020
Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning
CORL 2020
Chaining Behaviors from Data with Model-Free Reinforcement Learning
CORL 2020
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
ICLR 2019
Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
ICLR 2019
End-To-End Robotic Reinforcement Learning without Reward Engineering
RSS 2019
Contextual Imagined Goals for Self-Supervised Robotic Learning
CORL 2019
Learning to Walk Via Deep Reinforcement Learning
RSS 2019
Wasserstein Dependency Measure for Representation Learning
NIPS 2019
Unsupervised Curricula for Visual Meta-Reinforcement Learning
NIPS 2019
Guided Meta-Policy Search
NIPS 2019
Planning with Goal-Conditioned Policies
NIPS 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
NIPS 2019
Off-Policy Evaluation via Off-Policy Classification
NIPS 2019
MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies
NIPS 2019
Causal Confusion in Imitation Learning
NIPS 2019
When to Trust Your Model: Model-Based Policy Optimization
NIPS 2019
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
NIPS 2019
Meta-Learning with Implicit Gradients
NIPS 2019
Compositional Plan Vectors
NIPS 2019
Improvisation through Physical Understanding: Using Novel Objects As Tools with Visual Foresight
RSS 2019
Learning Latent Plans from Play
CORL 2019
Deep Dynamics Models for Learning Dexterous Manipulation
CORL 2019
Entity Abstraction in Visual Model-Based Reinforcement Learning
CORL 2019
Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow
ICLR 2019
Learning Actionable Representations with Goal Conditioned Policies
ICLR 2019
Time-Agnostic Prediction: Predicting Predictable Video Frames
ICLR 2019
Recall Traces: Backtracking Models for Efficient Reinforcement Learning
ICLR 2019
Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL
ICLR 2019
Automatically Composing Representation Transformations as a Means for Generalization
ICLR 2019
Guiding Policies with Language via Meta-Learning
ICLR 2019
Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
ICLR 2019
ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots
CORL 2019
Online Meta-Learning
ICML 2019
Diagnosing Bottlenecks in Deep Q-learning Algorithms
ICML 2019
EMI: Exploration with Mutual Information
ICML 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
ICML 2019
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
ICML 2019
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
ICML 2019
PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings
ICCV 2019
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
CORL 2019
Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning
CORL 2019
Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks
CVPR 2019
From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following
ICLR 2019
RoboNet: Large-Scale Multi-Robot Learning
CORL 2019
InfoBot: Transfer and Exploration via the Information Bottleneck
ICLR 2019
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
ICLR 2019
Unsupervised Learning via Meta-Learning
ICLR 2019
Diversity is All You Need: Learning Skills without a Reward Function
ICLR 2019
Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation
CORL 2018
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
ICML 2018
Latent Space Policies for Hierarchical Reinforcement Learning
ICML 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
ICML 2018
Regret Minimization for Partially Observable Deep Reinforcement Learning
ICML 2018
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control
ICML 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
ICML 2018
One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning
RSS 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
NIPS 2018
Divide-and-Conquer Reinforcement Learning
ICLR 2018
Temporal Difference Models: Model-Free Deep RL for Model-Based Control
ICLR 2018
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
ICLR 2018
Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
ICLR 2018
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
ICLR 2018
Stochastic Variational Video Prediction
ICLR 2018
Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm
ICLR 2018
Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control
CVPR 2018
Data-Efficient Hierarchical Reinforcement Learning
NIPS 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
NIPS 2018
Probabilistic Model-Agnostic Meta-Learning
NIPS 2018
Visual Reinforcement Learning with Imagined Goals
NIPS 2018
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior
NIPS 2018
Visual Memory for Robust Path Following
NIPS 2018
Meta-Reinforcement Learning of Structured Exploration Strategies
NIPS 2018
Learning with Latent Language
NAACL 2018
Few-Shot Goal Inference for Visuomotor Learning and Planning
CORL 2018
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping
CORL 2018
Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation
CORL 2018
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning
CORL 2018
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
RSS 2018
Shared Autonomy via Deep Reinforcement Learning
RSS 2018
The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes?
CORL 2017
Self-Supervised Visual Planning with Temporal Skip Connections
CORL 2017
One-Shot Visual Imitation Learning via Meta-Learning
CORL 2017
Modular Multitask Reinforcement Learning with Policy Sketches
ICML 2017
End-to-End Learning of Semantic Grasping
CORL 2017
GPLAC: Generalizing Vision-Based Robotic Skills Using Weakly Labeled Images
ICCV 2017
CAD2RL: Real Single-Image Flight Without a Single Real Image
RSS 2017
Unsupervised Perceptual Rewards for Imitation Learning
RSS 2017
Reinforcement Learning with Deep Energy-Based Policies
ICML 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
ICML 2017
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
ICML 2017
Value Iteration Networks
IJCAI 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
NIPS 2017
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
NIPS 2017
Cognitive Mapping and Planning for Visual Navigation
CVPR 2017
Learning Robotic Manipulation of Granular Media
CORL 2017
Learning to Poke by Poking: Experiential Learning of Intuitive Physics
NIPS 2016
Unsupervised Learning for Physical Interaction through Video Prediction
NIPS 2016
Continuous Deep Q-Learning with Model-based Acceleration
ICML 2016
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
ICML 2016
End-to-End Training of Deep Visuomotor Policies
JMLR 2016
Backprop KF: Learning Discriminative Deterministic State Estimators
NIPS 2016
Guided Policy Search via Approximate Mirror Descent
NIPS 2016
Value Iteration Networks
NIPS 2016
Recurrent Network Models for Human Dynamics
ICCV 2015
Trust Region Policy Optimization
ICML 2015
Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics
NIPS 2014
Learning Complex Neural Network Policies with Trajectory Optimization
ICML 2014
Guided Policy Search
ICML 2013
Variational Policy Search via Trajectory Optimization
NIPS 2013
Nonlinear Inverse Reinforcement Learning with Gaussian Processes
NIPS 2011
Feature Construction for Inverse Reinforcement Learning
NIPS 2010