Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
The Value of Reward Lookahead in Reinforcement Learning
NIPS 2024
AGR: Reinforced Causal Agent-Guided Self-explaining Rationalization
ACL 2024
Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration
NIPS 2024
Bit_numeval at SemEval-2024 Task 7: Enhance Numerical Sensitivity and Reasoning Completeness for Quantitative Understanding
SEMEVAL 2024
Beyond Optimism: Exploration With Partially Observable Rewards
NIPS 2024
Sample-efficient Adversarial Imitation Learning
JMLR 2024
Adaptive Exploration for Data-Efficient General Value Function Evaluations
NIPS 2024
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
NIPS 2024
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
AAAI 2024
Rethinking Discount Regularization: New Interpretations, Unintended Consequences, and Solutions for Regularization in Reinforcement Learning
JMLR 2024
An Analysis of Quantile Temporal-Difference Learning
JMLR 2024
Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
NIPS 2024
Sample Complexity of Variance-Reduced Distributionally Robust Q-Learning
JMLR 2024
FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
NIPS 2024
Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning
AAAI 2024
Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property
AAAI 2024
Beyond Expected Return: Accounting for Policy Reproducibility When Evaluating Reinforcement Learning Algorithms
AAAI 2024
Deep Reinforcement Learning-based Dialogue Policy with Graph Convolutional Q-network
COLING 2024
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
JMLR 2024
Stress-Testing Capability Elicitation With Password-Locked Models
NIPS 2024
Model-Free Representation Learning and Exploration in Low-Rank MDPs
JMLR 2024
Improving Language Model Reasoning with Self-motivated Learning
COLING 2024
What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction
AAAI 2024
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
NIPS 2024
A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy Search Over Policy Trees
AAAI 2024