Tom Zahavy

22 papers · 2016–2025 · 7 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (7) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 🏃 Academic Marathon (9) · 🌈 Renaissance Researcher (5) · 🗺️ Taxonomy Completionist (31) · 👑 Triple Crown · 🏆 Grand Slam · 🧬 Topic Evolution · 🏆 Keyword Champion (4) · ⚡ Prolific Year (5) · 💎 Century Club (22) · 🗃️ Keyword Collector (76)

Conferences

NIPS (7) · ICLR (5) · ICML (5) · AAAI (2) · ALT (1) · MLHC (1) · UAI (1)

Top co-authors

Satinder Singh (6) · Satinder P. Singh (5) · Shie Mannor (5) · Sebastian Flennerhag (5) · Daniel J Mankowitz (4) · Zhongwen Xu (4) · Hado P van Hasselt (3) · Brendan O'Donoghue (3) · David Silver (3) · Vivek Veeriah (3)

Keywords

reinforcement learning (4) · apprenticeship learning (4) · deep q-network (3) · deep reinforcement learning (3) · markov decision process (2) · policy optimization (2) · atari game (2) · hierarchical reinforcement learning (2) · constrained mdp (2) · representation learning (2) · off-policy learning (2) · bayesian regularization (1) · policy evaluation (1) · function approximation (1) · unsupervised pretraining (1) · fenchel duality (1) · feature learning (1) · gradient descent (1) · convex optimization (1) · hierarchical representation (1)

Papers

Mastering Board Games by External and Internal Planning with Language Models · ICML 2025
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs · ICML 2023
Discovering Evolution Strategies via Meta-Black-Box Optimization · ICLR 2023
Optimistic Meta-Gradients · NIPS 2023
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality · ICLR 2023
Bootstrapped Meta-Learning · ICLR 2022
Online Apprenticeship Learning · AAAI 2022
Palm up: Playing in the Latent Manifold for Unsupervised Pretraining · NIPS 2022
Balancing Constraints and Rewards with Meta-Gradient D4PG · ICLR 2021
Discovering a set of policies for the worst case reward · ICLR 2021
Online Limited Memory Neural-Linear Bandits with Likelihood Matching · ICML 2021
Discovery of Options via Meta-Learned Subgoals · NIPS 2021
Emphatic Algorithms for Deep Reinforcement Learning · ICML 2021
Reward is enough for convex MDPs · NIPS 2021
Unknown mixing times in apprenticeship and reinforcement learning · UAI 2020
A Self-Tuning Actor-Critic Algorithm · NIPS 2020
Apprenticeship Learning via Frank-Wolfe · AAAI 2020
Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies · ALT 2020
Learning to Ask Medical Questions using Reinforcement Learning · MLHC 2020
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning · NIPS 2018
Shallow Updates for Deep Reinforcement Learning · NIPS 2017
Graying the black box: Understanding DQNs · ICML 2016