Yonathan Efroni
33 papers · 2018–2026 · 8 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Conference Polyglot (7)
Keyword Pioneer
Interdisciplinary Bridge
Academic Marathon (7)
Taxonomy Completionist (34)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (19)
Triple Crown
Keyword Champion (2)
Grand Slam
Deep Specialist (10)
Topic Evolution
Conference Pioneer
Prolific Year (6)
Keyword Collector (107)
Unstoppable (8)
Century Club (32)
Conferences
ICML (14)
NIPS (8)
ICLR (4)
AAAI (3)
COLT (1)
EACL (1)
JMLR (1)
UAI (1)
Keywords
reinforcement learning (12)
regret bound (6)
online learning (5)
policy learning (5)
contextual bandit (4)
sample complexity (4)
dynamic programming (4)
policy optimization (3)
policy iteration (3)
markov decision process (3)
optimal policy (3)
multi-task learning (2)
greedy policy (2)
partial observability (2)
policy improvement (2)
multi-armed bandit (2)
value iteration (2)
model-based reinforcement learning (2)
deep reinforcement learning (2)
adversarial robustness (2)
Papers
Imbalanced Gradients in RL Post-Training of Multi-Task LLMs
EACL 2026
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
ICLR 2025
Aligned Multi Objective Optimization
ICML 2025
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
ICLR 2025
Pearl: A Production-Ready Reinforcement Learning Agent
JMLR 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
NIPS 2024
PcLast: Discovering Plannable Continuous Latent States
ICML 2024
Prospective Side Information for Latent MDPs
ICML 2024
Principled Offline RL in the Presence of Rich Exogenous Information
ICML 2023
Reward-Mixing MDPs with Few Latent Contexts are Learnable
ICML 2023
Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information
COLT 2022
Mirror Descent Policy Optimization
ICLR 2022
Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics
ICLR 2022
Sparsity in Partially Controllable Linear Systems
ICML 2022
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
ICML 2022
Tractable Optimality in Episodic Latent MABs
NIPS 2022
Provable Reinforcement Learning with a Short-Term Memory
ICML 2022
Confidence-Budget Matching for Sequential Budgeted Learning
ICML 2021
Minimax Regret for Stochastic Shortest Path
NIPS 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
NIPS 2021
Reinforcement Learning with Trajectory Feedback
AAAI 2021
Reinforcement Learning in Reward-Mixing MDPs
NIPS 2021
Bandits with partially observable confounded data
UAI 2021
Optimistic Policy Optimization with Bandit Feedback
ICML 2020
Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs
AAAI 2020
Multi-step Greedy Reinforcement Learning Algorithms
ICML 2020
Online Planning with Lookahead Policies
NIPS 2020
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
NIPS 2019
Action Robust Reinforcement Learning and Applications in Continuous Control
ICML 2019
Exploration Conscious Reinforcement Learning Revisited
ICML 2019
How to Combine Tree-Search Methods in Reinforcement Learning
AAAI 2019
Beyond the One-Step Greedy Approach in Reinforcement Learning
ICML 2018
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning
NIPS 2018