Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Safe Policy Learning for Continuous Control
CORL 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
ICML 2020
Learning to Compose Hierarchical Object-Centric Controllers for Robotic Manipulation
CORL 2020
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
NIPS 2020
Preference-based Reinforcement Learning with Finite-Time Guarantees
NIPS 2020
R-learning in actor-critic model offers a biologically relevant mechanism for sequential decision-making
NIPS 2020
Constrained Reinforcement Learning via Policy Splitting
ACML 2020
Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs
AAAI 2020
Discretizing Continuous Action Space for On-Policy Optimization
AAAI 2020
Attentive Experience Replay
AAAI 2020
Exploring Data Aggregation in Policy Learning for Vision-Based Urban Autonomous Driving
CVPR 2020
How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization
NIPS 2020
Learning Guidance Rewards with Trajectory-space Smoothing
NIPS 2020
Discovering Reinforcement Learning Algorithms
NIPS 2020
Experimental design for MRI by greedy policy search
NIPS 2020
Finite-Memory Near-Optimal Learning for Markov Decision Processes with Long-Run Average Reward
UAI 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
ICML 2020
GraphOpt: Learning Optimization Models of Graph Formation
ICML 2020
Safe Reinforcement Learning in Constrained Markov Decision Processes
ICML 2020
Learning to search efficiently for causally near-optimal treatments
NIPS 2020
Forethought and Hindsight in Credit Assignment
NIPS 2020
Inverse Reinforcement Learning from a Gradient-based Learner
NIPS 2020
Guided Dialogue Policy Learning without Adversarial Learning in the Loop
EMNLP 2020
Incorporating Stylistic Lexical Preferences in Generative Language Models
EMNLP 2020
Modeling Protagonist Emotions for Emotion-Aware Storytelling
EMNLP 2020