Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
NIPS 2020
Online learning with dynamics: A minimax perspective
NIPS 2020
Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems
JMLR 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss
NIPS 2020
Information Theoretic Regret Bounds for Online Nonlinear Control
NIPS 2020
Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems
NIPS 2020
Can We Learn Heuristics for Graphical Model Inference Using Reinforcement Learning?
CVPR 2020
Lifelong Learning with a Changing Action Set
AAAI 2020
Combining Cognitive Modeling and Reinforcement Learning for Clarification in Dialogue
COLING 2020
Elaborating on Learned Demonstrations with Temporal Logic Specifications
RSS 2020
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal
COLT 2020
Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes
COLT 2020
Stable Policy Optimization via Off-Policy Divergence Regularization
UAI 2020
Policy learning in SE(3) action spaces
CORL 2020
Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation
ACL 2020
Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
ACL 2020
Pessimism About Unknown Unknowns Inspires Conservatism
COLT 2020
Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning
CORL 2020
Learning Efficient Dialogue Policy from Demonstrations through Shaping
ACL 2020
Stylized Text Generation: Approaches and Applications
ACL 2020
Interactive Imitation Learning in State-Space
CORL 2020
Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
L4DC 2020
A Duality Approach for Regret Minimization in Average-Award Ergodic Markov Decision Processes
L4DC 2020
Optimistic robust linear quadratic dual control
L4DC 2020
Partially Observable Markov Decision Process Modelling for Assessing Hierarchies
ACML 2020