Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Policy Optimization with Second-Order Advantage Information
IJCAI 2018
Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
IJCAI 2018
Maximum Causal Tsallis Entropy Imitation Learning
NIPS 2018
Time Limits in Reinforcement Learning
ICML 2018
The Uncertainty Bellman Equation and Exploration
ICML 2018
Hierarchical Imitation and Reinforcement Learning
ICML 2018
Online Robust Policy Learning in the Presence of Unknown Adversaries
NIPS 2018
Credit Assignment For Collective Multiagent RL With Global Rewards
NIPS 2018
Learning Beam Search Policies via Imitation Learning
NIPS 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
NIPS 2018
Learning Safe Policies with Expert Guidance
NIPS 2018
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
NIPS 2018
Differentiable MPC for End-to-end Planning and Control
NIPS 2018
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning
ICML 2018
Learning to Act in Decentralized Partially Observable MDPs
ICML 2018
Occam's razor is insufficient to infer the preferences of irrational agents
NIPS 2018
Verifiable Reinforcement Learning via Policy Extraction
NIPS 2018
Markov Decision Processes with Continuous Side Information
ALT 2018
Convergence of Value Aggregation for Imitation Learning
AISTATS 2018
Multi-Level Policy and Reward Reinforcement Learning for Image Captioning
IJCAI 2018
A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning
IJCAI 2018
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
ICML 2018
Learning Environmental Calibration Actions for Policy Self-Evolution
IJCAI 2018
On Learning Intrinsic Rewards for Policy Gradient Methods
NIPS 2018
Fast Model Identification via Physics Engines for Data-Efficient Policy Search
IJCAI 2018