Research Explorer
Reinforcement Learning › Methods › Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing
NIPS 2018
Context-dependent upper-confidence bounds for directed exploration
NIPS 2018
Fast deep reinforcement learning using online adjustments from the past
NIPS 2018
Scalable Bilinear Pi Learning Using State and Action Features
ICML 2018
Mix & Match Agent Curricula for Reinforcement Learning
ICML 2018
Learning Globally Optimized Object Detector via Policy Gradient
CVPR 2018
Dynamic Zoom-In Network for Fast Object Detection in Large Images
CVPR 2018
Deep Reinforcement Learning of Region Proposal Networks for Object Detection
CVPR 2018
Generative Temporal Models with Spatial Memory for Partially Observed Environments
ICML 2018
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
ICML 2018
Dual Policy Iteration
NIPS 2018
Scalable Coordinated Exploration in Concurrent Reinforcement Learning
NIPS 2018
Meta-Gradient Reinforcement Learning
NIPS 2018
Recurrent World Models Facilitate Policy Evolution
NIPS 2018
Q-learning with Nearest Neighbors
NIPS 2018
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
NIPS 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
NIPS 2018
An Off-policy Policy Gradient Theorem Using Emphatic Weightings
NIPS 2018
Randomized Prior Functions for Deep Reinforcement Learning
NIPS 2018
On Oracle-Efficient PAC RL with Rich Observations
NIPS 2018
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
NIPS 2018
Policy Optimization via Importance Sampling
NIPS 2018
Transfer of Deep Reactive Policies for MDP Planning
NIPS 2018
Unsupervised Learning based Jump-Diffusion Process for Object Tracking in Video Surveillance
IJCAI 2018
Goal-HSVI: Heuristic Search Value Iteration for Goal POMDPs
IJCAI 2018