Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Model-Based Episodic Memory Induces Dynamic Hybrid Controls
NIPS 2021
Stabilizing Dynamical Systems via Policy Gradient Methods
NIPS 2021
MobILE: Model-Based Imitation Learning From Observation Alone
NIPS 2021
Robust Predictable Control
NIPS 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
NIPS 2021
Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies
NIPS 2021
Coordinated Proximal Policy Optimization
NIPS 2021
Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch
NIPS 2021
Navigating to the Best Policy in Markov Decision Processes
NIPS 2021
Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations
NIPS 2021
Distributionally Robust Imitation Learning
NIPS 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
NIPS 2021
Counterexample Guided RL Policy Refinement Using Bayesian Optimization
NIPS 2021
Average-Reward Learning and Planning with Options
NIPS 2021
Twice regularized MDPs and the equivalence between robustness and regularization
NIPS 2021
Translation-based Supervision for Policy Generation in Simultaneous Neural Machine Translation
EMNLP 2021
A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition
EMNLP 2021
Generalization in Text-based Games via Hierarchical Reinforcement Learning
EMNLP 2021
Mapping Language to Programs using Multiple Reward Components with Inverse Reinforcement Learning
EMNLP 2021
Hierarchical Reinforcement Learning with Timed Subgoals
NIPS 2021
Accelerating Quadratic Optimization with Reinforcement Learning
NIPS 2021
Faster Policy Learning with Continuous-Time Gradients
L4DC 2021
Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech
INTERSPEECH 2021
Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning
EMNLP 2021
Safe Policy Optimization with Local Generalized Linear Function Approximations
NIPS 2021