Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Generating Long-term Trajectories Using Deep Hierarchical Networks (NIPS 2016)
Reinforcement Learning for Visual Object Detection (CVPR 2016)
PAC Reinforcement Learning with Rich Observations (NIPS 2016)
Interactive Spoken Content Retrieval by Deep Reinforcement Learning (INTERSPEECH 2016)
Learning values across many orders of magnitude (NIPS 2016)
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization (ICML 2016)
Near Optimal Behavior via Approximate State Abstraction (ICML 2016)
Improving PAC Exploration Using the Median Of Means (NIPS 2016)
Opponent Modeling in Deep Reinforcement Learning (ICML 2016)
Graying the black box: Understanding DQNs (ICML 2016)
On the Rate of Convergence and Error Bounds for LSTD(λ) (ICML 2015)
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence (ICML 2015)
Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games (ICML 2015)
Off-policy Model-based Learning under Unknown Factored Dynamics (ICML 2015)
Learning Continuous Control Policies by Stochastic Value Gradients (NIPS 2015)
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning (NIPS 2015)
Universal Value Function Approximators (ICML 2015)
On Convergence of Emphatic Temporal-Difference Learning (COLT 2015)
A Deeper Look at Planning as Learning from Replay (ICML 2015)
Regularized Policy Gradients: Direct Variance Reduction in Policy Gradient Estimation (ACML 2015)
Sample Efficient Reinforcement Learning with Gaussian Processes (ICML 2014)
Bayes-Adaptive Simulation-based Search with Value Function Approximation (NIPS 2014)
Difference of Convex Functions Programming for Reinforcement Learning (NIPS 2014)
RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning (NIPS 2014)
Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process (AISTATS 2014)