Aviv Rosenberg
20 papers · 2019–2025 · across 7 top CS/AI conferences
Achievements
Conference Polyglot (7)
Hot Topic Early Bird
Keyword Pioneer
Interdisciplinary Bridge
Academic Marathon (6)
Renaissance Researcher (6)
Taxonomy Completionist (23)
Grand Slam
Century Club (20)
Unstoppable (7)
Keyword Collector (65)
Conferences
NIPS (7)
ICML (6)
AAAI (2)
COLT (2)
ICLR (1)
IJCAI (1)
JMLR (1)
Keywords
regret bound (13)
markov decision process (8)
reinforcement learning (8)
policy optimization (7)
online learning (6)
delayed feedback (5)
stochastic shortest path (5)
bandit feedback (5)
adversarial mdp (3)
adversarial learning (3)
regret minimization (2)
linear bandit (2)
optimal policy (2)
follow the regularized leader (2)
deep reinforcement learning (2)
combinatorial semi-bandit (2)
dynamic programming (1)
value function (1)
path planning (1)
language model alignment (1)
Papers
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
JMLR 2025
Building Math Agents with Multi-Turn Iterative Preference Learning
ICLR 2025
Multi-turn Reinforcement Learning from Preference Human Feedback
NIPS 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
ICML 2024
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
NIPS 2024
Online Weighted Paging with Unknown Weights
NIPS 2024
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
COLT 2023
Planning and Learning with Adaptive Lookahead
AAAI 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
ICML 2023
Cooperative Online Learning in Stochastic and Adversarial MDPs
ICML 2022
Policy Optimization for Stochastic Shortest Path
COLT 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
NIPS 2022
Learning Adversarial Markov Decision Processes with Delayed Feedback
AAAI 2022
Stochastic Shortest Path with Adversarially Changing Costs
IJCAI 2021
Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
NIPS 2021
Minimax Regret for Stochastic Shortest Path
NIPS 2021
Optimistic Policy Optimization with Bandit Feedback
ICML 2020
Near-optimal Regret Bounds for Stochastic Shortest Path
ICML 2020
Online Convex Optimization in Adversarial Markov Decision Processes
ICML 2019
Online Stochastic Shortest Path with Bandit Feedback and Unknown Transition Function
NIPS 2019