Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
ICML 2023
Task-Oriented Koopman-Based Control with Contrastive Encoder
CORL 2023
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
AAAI 2023
Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes
AAAI 2023
trlX: A Framework for Large Scale Open Source RLHF
EMNLP 2023
Learning Pessimism for Reinforcement Learning
AAAI 2023
Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis
EMNLP 2023
The Sufficiency of Off-Policyness and Soft Clipping: PPO Is Still Insufficient according to an Off-Policy Measure
AAAI 2023
A Minimal Approach for Natural Language Action Space in Text-based Games
EMNLP 2023
Global Convergence of Two-Timescale Actor-Critic for Solving Linear Quadratic Regulator
AAAI 2023
Harnessing the Plug-and-Play Controller by Prompting
EMNLP 2023
Policy Learning for Active Target Tracking over Continuous $SE(3)$ Trajectories
L4DC 2023
Learning to Discern: Imitating Heterogeneous Human Demonstrations with Preference and Representation Learning
CORL 2023
Enhancing Language Model with Unit Test Techniques for Efficient Regular Expression Generation
EMNLP 2023
Soft Action Priors: Towards Robust Policy Transfer
AAAI 2023
Exploiting Multiple Abstractions in Episodic RL via Reward Shaping
AAAI 2023
Augmented Proximal Policy Optimization for Safe Reinforcement Learning
AAAI 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
AAAI 2023
On the Convergence of SARSA with Linear Function Approximation
ICML 2023
Policy Gradient Play with Networked Agents in Markov Potential Games
L4DC 2023
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
ICML 2023
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
ICML 2023
Learning to Describe for Predicting Zero-shot Drug-Drug Interactions
EMNLP 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
ICML 2023
Boosting Offline Reinforcement Learning with Action Preference Query
ICML 2023