Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Counterfactual Programming for Optimal Control
L4DC 2020
Stochastic bandits with arm-dependent delays
ICML 2020
PackIt: A Virtual Environment for Geometric Planning
ICML 2020
Machine-oriented NMT Adaptation for Zero-shot NLP tasks: Comparing the Usefulness of Close and Distant Languages
COLING 2020
Generating Persona Consistent Dialogues by Exploiting Natural Language Inference
AAAI 2020
Severity-Aware Semantic Segmentation With Reinforced Wasserstein Training
CVPR 2020
Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits
UAI 2020
Bayesian model predictive control: Efficient model exploration and regret bounds using posterior sampling
L4DC 2020
Learning the model-free linear quadratic regulator via random search
L4DC 2020
Toward fusion plasma scenario planning for NSTX-U using machine-learning-accelerated models
L4DC 2020
Lyceum: An efficient and scalable ecosystem for robot learning
L4DC 2020
Learning to Plan via Deep Optimistic Value Exploration
L4DC 2020
Fair Contextual Multi-Armed Bandits: Theory and Experiments
UAI 2020
On the design of consequential ranking algorithms
UAI 2020
Regret Analysis of Bandit Problems with Causal Background Knowledge
UAI 2020
Learning Behaviors with Uncertain Human Feedback
UAI 2020
Learning Intrinsic Rewards as a Bi-Level Optimization Problem
UAI 2020
Randomized Exploration for Non-Stationary Stochastic Linear Bandits
UAI 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
UAI 2020
Efficient Object Detection in Large Images Using Deep Reinforcement Learning
WACV 2020
AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity
AISTATS 2020
Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration
AISTATS 2020
Conservative Exploration in Reinforcement Learning
AISTATS 2020
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
ICML 2020
From Importance Sampling to Doubly Robust Policy Gradient
ICML 2020