Policy Learning

2068 directly classified papers

Papers

The Importance of Non-Markovianity in Maximum State Entropy Exploration ICML 2022

Recursive Reinforcement Learning NIPS 2022

Giving Feedback on Interactive Student Programs with Meta-Exploration NIPS 2022

Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching ICML 2022

Constrained Variational Policy Optimization for Safe Reinforcement Learning ICML 2022

Delayed Reinforcement Learning by Imitation ICML 2022

Difference Advantage Estimation for Multi-Agent Policy Gradients ICML 2022

PALMER: Perception-Action Loop with Memory for Long-Horizon Planning NIPS 2022

An Analytical Update Rule for General Policy Optimization ICML 2022

Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime ICML 2022

Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential NIPS 2022

Curriculum Reinforcement Learning via Constrained Optimal Transport ICML 2022

Improving Policy Optimization with Generalist-Specialist Learning ICML 2022

Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments NIPS 2022

How to talk so AI will learn: Instructions, descriptions, and autonomy NIPS 2022

LAPO: Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning NIPS 2022

Action-Sufficient State Representation Learning for Control with Structural Constraints ICML 2022

Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning ICML 2022

Distributional Reinforcement Learning for Risk-Sensitive Policies NIPS 2022

A Parametric Class of Approximate Gradient Updates for Policy Optimization ICML 2022

Mirror Learning: A Unifying Framework of Policy Optimisation ICML 2022

A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning NIPS 2022

PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation ICML 2022

Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning ICML 2022

On the Convergence Rates of Policy Gradient Methods JMLR 2022