Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence
L4DC 2020
Lambda-Policy Iteration with Randomization for Contractive Models with Infinite Policies: Well-Posedness and Convergence
L4DC 2020
Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach
L4DC 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
ICML 2020
GTI: Learning to Generalize across Long-Horizon Tasks from Human Demonstrations
RSS 2020
Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving
RSS 2020
Adaptive Smoothing for Path Integral Control
JMLR 2020
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
NIPS 2020
Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes
NIPS 2020
A Boolean Task Algebra for Reinforcement Learning
NIPS 2020
f-IRL: Inverse Reinforcement Learning via State Marginal Matching
CORL 2020
Efficient Deep Reinforcement Learning via Adaptive Policy Transfer
IJCAI 2020
Learning Intrinsic Rewards as a Bi-Level Optimization Problem
UAI 2020
Unknown mixing times in apprenticeship and reinforcement learning
UAI 2020
SVRG for Policy Evaluation with Fewer Gradient Evaluations
IJCAI 2020
An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search
NIPS 2020
Intrinsic Reward Driven Imitation Learning via Generative Model
ICML 2020
Learning the Globally Optimal Distributed LQ Regulator
L4DC 2020
Production-based Cognitive Models as a Test Suite for Reinforcement Learning Algorithms
EMNLP 2020
Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems
EMNLP 2020
Learning to Generalize for Sequential Decision Making
EMNLP 2020
Task-Completion Dialogue Policy Learning via Monte Carlo Tree Search with Dueling Network
EMNLP 2020
Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies
ALT 2020
Causal Imitation Learning With Unobserved Confounders
NIPS 2020
Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization
NIPS 2020