Richard S. Sutton
17 papers · 2006–2025 · across 5 top CS/AI conferences
Achievements
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Taxonomy Completionist (12)
Hot Topic Early Bird
Keyword Trendsetter Combo (6)
Topic Pioneer
Deep Specialist (12)
Keyword Champion (3)
Keyword Collector (66)
Trend Setter
Century Club (17)
Conference Pioneer
Conferences
NIPS (9)
JMLR (4)
ICML (2)
AAAI (1)
IJCAI (1)
Research topics
Keywords
reinforcement learning (11)
value function (7)
temporal-difference learning (5)
temporal difference learning (5)
policy evaluation (5)
function approximation (5)
off-policy learning (4)
markov decision process (3)
eligibility trace (3)
linear function approximation (3)
stochastic gradient descent (3)
model-based reinforcement learning (3)
convergence analysis (2)
bellman error (2)
temporal abstraction (2)
linear function approximator (1)
natural gradient (1)
computational neuroscience (1)
deep learning (1)
online learning (1)
Papers
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters
ICML 2025
Reward-Respecting Subtasks for Model-Based Reinforcement Learning (Abstract Reprint)
AAAI 2024
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
JMLR 2023
Toward Efficient Gradient-Based Value Estimation
ICML 2023
Doubly-Asynchronous Value Iteration: Making Value Iteration Asynchronous in Actions
NIPS 2022
Planning with Expectation Models
IJCAI 2019
On Generalized Bellman Equations and Temporal-Difference Learning
JMLR 2018
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
JMLR 2016
True Online Temporal-Difference Learning
JMLR 2016
Weighted importance sampling for off-policy learning with linear function approximation
NIPS 2014
Universal Option Models
NIPS 2014
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
NIPS 2009
Multi-Step Dyna Planning for Policy Evaluation and Control
NIPS 2009
A computational model of hippocampal function in trace conditioning
NIPS 2008
A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
NIPS 2008
Incremental Natural Actor-Critic Algorithms
NIPS 2007
iLSTD: Eligibility Traces and Convergence Analysis
NIPS 2006