Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
NIPS 2023
Bayesian Learning of Optimal Policies in Markov Decision Processes with Countably Infinite State-Space
NIPS 2023
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
ICML 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
ICML 2023
A Simple Solution for Offline Imitation from Observations and Examples with Possibly Incomplete Trajectories
NIPS 2023
Reward-Mixing MDPs with Few Latent Contexts are Learnable
ICML 2023
Hierarchical Imitation Learning with Vector Quantized Models
ICML 2023
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks
NIPS 2023
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
ICML 2023
LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
ICML 2023
Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum
ICML 2023
Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning
ICML 2023
Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
NIPS 2023
Sample Efficient Reinforcement Learning in Mixed Systems through Augmented Samples and Its Applications to Queueing Networks
NIPS 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
NIPS 2023
Policy Contrastive Imitation Learning
ICML 2023
Reparameterized Policy Learning for Multimodal Trajectory Optimization
ICML 2023
Towards Hierarchical Policy Learning for Conversational Recommendation with Hypergraph-based Reinforcement Learning
IJCAI 2023
For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal
ICML 2023
Why Target Networks Stabilise Temporal Difference Methods
ICML 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
ICML 2023
Optimizing DDPM Sampling with Shortcut Fine-Tuning
ICML 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
ICML 2023
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning
ICML 2023
Does Sparsity Help in Learning Misspecified Linear Bandits?
ICML 2023