Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Globally Convergent Policy Search for Output Estimation
NIPS 2022
Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
NIPS 2022
Exploring through Random Curiosity with General Value Functions
NIPS 2022
Robust Imitation of a Few Demonstrations with a Backwards Model
NIPS 2022
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
AISTATS 2022
TaSIL: Taylor Series Imitation Learning
NIPS 2022
Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs
NIPS 2022
Robust Anytime Learning of Markov Decision Processes
NIPS 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
NIPS 2022
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity
CORL 2022
Discovered Policy Optimisation
NIPS 2022
USHER: Unbiased Sampling for Hindsight Experience Replay
CORL 2022
Plan To Predict: Learning an Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning
NIPS 2022
Semi-infinitely Constrained Markov Decision Processes
NIPS 2022
Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning
NIPS 2022
A Unified Framework for Alternating Offline Model Training and Policy Learning
NIPS 2022
Sequence Model Imitation Learning with Unobserved Contexts
NIPS 2022
A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation
NIPS 2022
MDPGT: Momentum-Based Decentralized Policy Gradient Tracking
AAAI 2022
Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning
ACML 2022
Goal Recognition as Reinforcement Learning
AAAI 2022
Adaptive Pairwise Weights for Temporal Credit Assignment
AAAI 2022
Reinforcement Learning Explainability via Model Transforms (Student Abstract)
AAAI 2022
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
AAAI 2022
Robust Action Gap Increasing with Clipped Advantage Learning
AAAI 2022