Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
ICML 2022
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
ICML 2022
Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States
JMLR 2022
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
ICML 2022
Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning
ICML 2022
Admissible Policy Teaching through Reward Design
AAAI 2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
ICML 2022
Learning Infinite-horizon Average-reward Markov Decision Process with Constraints
ICML 2022
Optimal Transport for Stationary Markov Chains via Policy Iteration
JMLR 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
ICML 2022
Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
JMLR 2022
Differentially Private Regret Minimization in Episodic Markov Decision Processes
AAAI 2022
Approximate Information State for Approximate Planning and Reinforcement Learning in Partially Observed Systems
JMLR 2022
Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon
JMLR 2022
Planning with Participation Constraints
AAAI 2022
Reinforcement Learning with Stochastic Reward Machines
AAAI 2022
Goal Recognition as Reinforcement Learning
AAAI 2022
Inferring Lexicographically-Ordered Rewards from Preferences
AAAI 2022
Near Optimality of Finite Memory Feedback Policies in Partially Observed Markov Decision Processes
JMLR 2022
Dynamic Dialogue Policy for Continual Reinforcement Learning
COLING 2022
A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization
AISTATS 2022
System-Agnostic Meta-Learning for MDP-based Dynamic Scheduling via Descriptive Policy
AISTATS 2022
A general sample complexity analysis of vanilla policy gradient
AISTATS 2022
Primal-Dual Stochastic Mirror Descent for MDPs
AISTATS 2022
Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions
AISTATS 2022