Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Learning to Switch Optimizers for Quadratic Programming
ACML 2021
The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies
L4DC 2021
Adaptive Risk Sensitive Model Predictive Control with Stochastic Search
L4DC 2021
Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator
L4DC 2021
Implicit Behavioral Cloning
CORL 2021
XIRL: Cross-embodiment Inverse Reinforcement Learning
CORL 2021
Specializing Versatile Skill Libraries using Local Mixture of Experts
CORL 2021
SCAPE: Learning Stiffness Control from Augmented Position Control Experiences
CORL 2021
You Only Evaluate Once: a Simple Baseline Algorithm for Offline RL
CORL 2021
Robust reinforcement learning under minimax regret for green security
UAI 2021
Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
ICML 2021
Escaping from zero gradient: Revisiting action-constrained reinforcement learning via Frank-Wolfe policy optimization
UAI 2021
Known unknowns: Learning novel concepts using reasoning-by-elimination
UAI 2021
A decentralized policy gradient approach to multi-task reinforcement learning
UAI 2021
CLAIM: curriculum learning policy for influence maximization in unknown social networks
UAI 2021
Contingency-aware influence maximization: A reinforcement learning approach
UAI 2021
Contextual policy transfer in reinforcement learning domains via deep mixtures-of-experts
UAI 2021
Explaining fast improvement in online imitation learning
UAI 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
COLT 2021
RMP2: A Structured Composable Policy Class for Robot Learning
RSS 2021
[RETRACTED] WeaSuL: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue
IJCNLP 2021
Turn-Level User Satisfaction Estimation in E-commerce Customer Service
IJCNLP 2021
Dr Jekyll & Mr Hyde: the strange case of off-policy policy updates
NIPS 2021
Robust Imitation Learning from Noisy Demonstrations
AISTATS 2021
Understanding the Effect of Stochasticity in Policy Optimization
NIPS 2021