Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning
JMLR 2023
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
JMLR 2023
Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity
JMLR 2023
Single Timescale Actor-Critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees
JMLR 2023
Online Reinforcement Learning with Uncertain Episode Lengths
AAAI 2023
Weighted Policy Constraints for Offline Reinforcement Learning
AAAI 2023
Heuristic Search in Dual Space for Constrained Fixed-Horizon POMDPs with Durative Actions
AAAI 2023
On the Convergence of SARSA with Linear Function Approximation
ICML 2023
Algorithm for Constrained Markov Decision Process with Linear Convergence
AISTATS 2023
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
ICML 2023
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
ICML 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
ICML 2023
Boosting Offline Reinforcement Learning with Action Preference Query
ICML 2023
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
ICML 2023
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
ICML 2023
Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints
AAAI 2023
Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning
ICML 2023
Policy Gradient in Robust MDPs with Global Convergence Guarantee
ICML 2023
Eventual Discounting Temporal Logic Counterfactual Experience Replay
ICML 2023
Reinforcement Learning with History Dependent Dynamic Contexts
ICML 2023
VA-learning as a more efficient alternative to Q-learning
ICML 2023
Planning and Learning with Adaptive Lookahead
AAAI 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
ICML 2023
On the Effectiveness of Offline RL for Dialogue Response Generation
ICML 2023
TGRL: An Algorithm for Teacher Guided Reinforcement Learning
ICML 2023