Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation
NIPS 2019
DAC: The Double Actor-Critic Architecture for Learning Options
NIPS 2019
Imitation-Projected Programmatic Reinforcement Learning
NIPS 2019
Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards
NIPS 2019
Learning from Trajectories via Subgoal Discovery
NIPS 2019
Distributional Policy Optimization: An Alternative Approach for Continuous Control
NIPS 2019
Fast Efficient Hyperparameter Tuning for Policy Gradient Methods
NIPS 2019
Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards
NIPS 2019
Surrogate Objectives for Batch Policy Optimization in One-step Decision Making
NIPS 2019
Causal Confusion in Imitation Learning
NIPS 2019
Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation
NIPS 2019
Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
NIPS 2019
Better Exploration with Optimistic Actor Critic
NIPS 2019
A Family of Robust Stochastic Operators for Reinforcement Learning
NIPS 2019
Trust Region-Guided Proximal Policy Optimization
NIPS 2019
Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model
NIPS 2019
Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator
NIPS 2019
Goal-conditioned Imitation Learning
NIPS 2019
Convergent Policy Optimization for Safe Reinforcement Learning
NIPS 2019
Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning
NIPS 2019
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
NIPS 2019
Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards
NIPS 2019
Robust exploration in linear quadratic reinforcement learning
NIPS 2019
Adaptive Auxiliary Task Weighting for Reinforcement Learning
NIPS 2019
Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
NIPS 2019