Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
The Importance of Non-Markovianity in Maximum State Entropy Exploration
ICML 2022
Recursive Reinforcement Learning
NIPS 2022
Giving Feedback on Interactive Student Programs with Meta-Exploration
NIPS 2022
Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching
ICML 2022
Constrained Variational Policy Optimization for Safe Reinforcement Learning
ICML 2022
Delayed Reinforcement Learning by Imitation
ICML 2022
Difference Advantage Estimation for Multi-Agent Policy Gradients
ICML 2022
PALMER: Perception-Action Loop with Memory for Long-Horizon Planning
NIPS 2022
An Analytical Update Rule for General Policy Optimization
ICML 2022
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
ICML 2022
Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
NIPS 2022
Curriculum Reinforcement Learning via Constrained Optimal Transport
ICML 2022
Improving Policy Optimization with Generalist-Specialist Learning
ICML 2022
Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments
NIPS 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
NIPS 2022
LAPO: Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning
NIPS 2022
Action-Sufficient State Representation Learning for Control with Structural Constraints
ICML 2022
Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
ICML 2022
Distributional Reinforcement Learning for Risk-Sensitive Policies
NIPS 2022
A Parametric Class of Approximate Gradient Updates for Policy Optimization
ICML 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
ICML 2022
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
NIPS 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
ICML 2022
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
ICML 2022
On the Convergence Rates of Policy Gradient Methods
JMLR 2022