Shangtong Zhang
27 papers · 2019–2026 · 6 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (13)
Conference Polyglot (6)
Keyword Pioneer
Academic Marathon (6)
Interdisciplinary Bridge
Taxonomy Completionist (24)
Dynamic Duo (10)
Keyword Champion (3)
Topic Pioneer
Deep Specialist (10)
Grand Slam
Topic Evolution
Unstoppable (7)
Conference Pioneer
Keyword Collector (101)
Century Club (26)
Prolific Year (8)
Conferences
AAAI (8)
ICML (7)
ICLR (4)
JMLR (3)
NIPS (3)
IJCAI (2)
Keywords
reinforcement learning (8)
off-policy learning (5)
function approximation (5)
temporal difference learning (4)
stochastic approximation (3)
deep reinforcement learning (3)
finite sample analysis (3)
off-policy evaluation (2)
representation learning (2)
deadly triad (2)
target network (2)
Markovian noise (2)
option framework (2)
variance reduction (2)
off-policy reinforcement learning (2)
off-policy actor-critic (2)
interpretable machine learning (1)
convergence analysis (1)
hierarchical learning (1)
policy evaluation (1)
Papers
Asymptotic and Finite Sample Analysis of Nonexpansive Stochastic Approximations with Markovian Noise
AAAI 2026
Revisiting a Design Choice in Gradient Temporal Difference Learning
ICLR 2025
Efficient Multi-Policy Evaluation for Reinforcement Learning
AAAI 2025
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
ICLR 2025
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning
ICLR 2025
Doubly Optimal Policy Evaluation for Reinforcement Learning
ICLR 2025
Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set
ICML 2025
Counterfactual Explanations for Continuous Action Reinforcement Learning
IJCAI 2025
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
JMLR 2025
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
ICML 2024
A New Challenge in Policy Evaluation
AAAI 2023
On the Convergence of SARSA with Linear Function Approximation
ICML 2023
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
JMLR 2022
Learning Expected Emphatic Traces for Deep RL
AAAI 2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control
JMLR 2022
Deep Residual Reinforcement Learning (Extended Abstract)
IJCAI 2021
Average-Reward Off-Policy Policy Evaluation with Function Approximation
ICML 2021
Breaking the Deadly Triad with a Target Network
ICML 2021
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
AAAI 2021
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
ICML 2020
Learning Retrospective Knowledge with Reverse Reinforcement Learning
NIPS 2020
Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards
AAAI 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
ICML 2020
Generalized Off-Policy Actor-Critic
NIPS 2019
DAC: The Double Actor-Critic Architecture for Learning Options
NIPS 2019
ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search
AAAI 2019
QUOTA: The Quantile Option Architecture for Reinforcement Learning
AAAI 2019