Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Learning Task-Distribution Reward Shaping with Meta-Learning
AAAI 2021
How RL Agents Behave When Their Actions Are Modified
AAAI 2021
Constrained Risk-Averse Markov Decision Processes
AAAI 2021
Synthesis of Search Heuristics for Temporal Planning via Reinforcement Learning
AAAI 2021
Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning
AAAI 2021
Solving JumpIN’ Using Zero-Dependency Reinforcement Learning (Student Abstract)
AAAI 2021
Extending Policy Shaping to Continuous State Spaces (Student Abstract)
AAAI 2021
State-Wise Adaptive Discounting from Experience (SADE): A Novel Discounting Scheme for Reinforcement Learning (Student Abstract)
AAAI 2021
Neuro-Symbolic Approaches for Text-Based Policy Learning
EMNLP 2021
Adaptive Information Seeking for Open-Domain Question Answering
EMNLP 2021
Efficient Dialogue Complementary Policy Learning via Deep Q-network Policy and Episodic Memory Policy
EMNLP 2021
Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser
EMNLP 2021
Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration
EMNLP 2021
Learning and Analyzing Generation Order for Undirected Sequence Models
EMNLP 2021
Learning Task Sampling Policy for Multitask Learning
EMNLP 2021
Tactical Optimism and Pessimism for Deep Reinforcement Learning
NeurIPS 2021
Identifiability in inverse reinforcement learning
NeurIPS 2021
Generalized Proximal Policy Optimization with Sample Reuse
NeurIPS 2021
Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification
NeurIPS 2021
Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
NeurIPS 2021
Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
NeurIPS 2021
CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum
NeurIPS 2021
Learning Collaborative Policies to Solve NP-hard Routing Problems
NeurIPS 2021
Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning
NeurIPS 2021
Ranking Policy Decisions
NeurIPS 2021