Research Explorer
Reinforcement Learning › Methods › Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
EMNLP 2023
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
EMNLP 2023
Symmetric (Optimistic) Natural Policy Gradient for Multi-Agent Learning with Parameter Convergence
AISTATS 2023
A Tighter Problem-Dependent Regret Bound for Risk-Sensitive Reinforcement Learning
AISTATS 2023
Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching
CoRL 2023
Human-in-the-Loop Task and Motion Planning for Imitation Learning
CoRL 2023
State-Conditioned Adversarial Subgoal Generation
AAAI 2023
Exploration in Reward Machines with Low Regret
AISTATS 2023
Mode-constrained Model-based Reinforcement Learning via Gaussian Processes
AISTATS 2023
Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning
EMNLP 2023
Struct-XLM: A Structure Discovery Multilingual Language Model for Enhancing Cross-lingual Transfer through Reinforcement Learning
EMNLP 2023
Enhancing Task-oriented Dialogue Systems with Generative Post-processing Networks
EMNLP 2023
Diversify Question Generation with Retrieval-Augmented Style Transfer
EMNLP 2023
InitLight: Initial Model Generation for Traffic Signal Control Using Adversarial Inverse Reinforcement Learning
IJCAI 2023
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
ICML 2023
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
ICML 2023
On the Study of Curriculum Learning for Inferring Dispatching Policies on the Job Shop Scheduling
IJCAI 2023
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
ICML 2023
Boosting Offline Reinforcement Learning with Action Preference Query
ICML 2023
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
ICML 2023
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
ICML 2023
Transferable Curricula through Difficulty Conditioned Generators
IJCAI 2023
On the Convergence of SARSA with Linear Function Approximation
ICML 2023
Spotlight News Driven Quantitative Trading Based on Trajectory Optimization
IJCAI 2023
Optimal Decision Tree Policies for Markov Decision Processes
IJCAI 2023