Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Goal Recognition as Reinforcement Learning
AAAI 2022
Improved Algorithms for Misspecified Linear Markov Decision Processes
AISTATS 2022
A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation
NIPS 2022
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
NIPS 2022
How to Reduce Action Space for Planning Domains? (Student Abstract)
AAAI 2022
Exponential Family Model-Based Reinforcement Learning via Score Matching
NIPS 2022
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
AISTATS 2022
Reinforcement Learning Explainability via Model Transforms (Student Abstract)
AAAI 2022
Robust Anytime Learning of Markov Decision Processes
NIPS 2022
Data augmentation for efficient learning from parametric experts
NIPS 2022
PALMER: Perception-Action Loop with Memory for Long-Horizon Planning
NIPS 2022
Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs
NIPS 2022
Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning
NIPS 2022
Robust Action Gap Increasing with Clipped Advantage Learning
AAAI 2022
Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
NIPS 2022
CUP: Critic-Guided Policy Reuse
NIPS 2022
Reward-Weighted Regression Converges to a Global Optimum
AAAI 2022
Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic
AAAI 2022
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
AAAI 2022
Adaptive Pairwise Weights for Temporal Credit Assignment
AAAI 2022
Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning
NIPS 2022
Lifelong Hyper-Policy Optimization with Multiple Importance Sampling Regularization
AAAI 2022
Constraint Sampling Reinforcement Learning: Incorporating Expertise for Faster Learning
AAAI 2022
Episodic Policy Gradient Training
AAAI 2022
Unsupervised Reinforcement Learning in Multiple Environments
AAAI 2022