Tom Zahavy
22 papers · 2016–2025 · 7 top CS/AI conferences
Achievements
Conference Polyglot (7)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Academic Marathon (9)
Renaissance Researcher (5)
Taxonomy Completionist (31)
Triple Crown
Grand Slam
Topic Evolution
Keyword Champion (4)
Prolific Year (5)
Century Club (22)
Keyword Collector (76)
Conferences
NIPS (7)
ICLR (5)
ICML (5)
AAAI (2)
ALT (1)
MLHC (1)
UAI (1)
Keywords
reinforcement learning (4)
apprenticeship learning (4)
deep q-network (3)
deep reinforcement learning (3)
markov decision process (2)
policy optimization (2)
atari game (2)
hierarchical reinforcement learning (2)
constrained mdp (2)
representation learning (2)
off-policy learning (2)
bayesian regularization (1)
policy evaluation (1)
function approximation (1)
unsupervised pretraining (1)
fenchel duality (1)
feature learning (1)
gradient descent (1)
convex optimization (1)
hierarchical representation (1)
Papers
Mastering Board Games by External and Internal Planning with Language Models
ICML 2025
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
ICML 2023
Discovering Evolution Strategies via Meta-Black-Box Optimization
ICLR 2023
Optimistic Meta-Gradients
NIPS 2023
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality
ICLR 2023
Bootstrapped Meta-Learning
ICLR 2022
Online Apprenticeship Learning
AAAI 2022
Palm up: Playing in the Latent Manifold for Unsupervised Pretraining
NIPS 2022
Balancing Constraints and Rewards with Meta-Gradient D4PG
ICLR 2021
Discovering a set of policies for the worst case reward
ICLR 2021
Online Limited Memory Neural-Linear Bandits with Likelihood Matching
ICML 2021
Discovery of Options via Meta-Learned Subgoals
NIPS 2021
Emphatic Algorithms for Deep Reinforcement Learning
ICML 2021
Reward is enough for convex MDPs
NIPS 2021
Unknown mixing times in apprenticeship and reinforcement learning
UAI 2020
A Self-Tuning Actor-Critic Algorithm
NIPS 2020
Apprenticeship Learning via Frank-Wolfe
AAAI 2020
Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies
ALT 2020
Learning to Ask Medical Questions using Reinforcement Learning
MLHC 2020
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
NIPS 2018
Shallow Updates for Deep Reinforcement Learning
NIPS 2017
Graying the black box: Understanding DQNs
ICML 2016