Matteo Pirotta
44 papers · 2013–2025 · 8 top CS/AI conferences
Achievements
Keyword Pioneer · Hot Topic Early Bird · Interdisciplinary Bridge · Taxonomy Completionist (15) · Conference Polyglot (8)
Taxonomy Completionist (15)
Academic Marathon (12)
Keyword Pioneer
Keyword Trendsetter Combo (3)
Dynamic Duo (30)
Keyword Champion
Grand Slam
Deep Specialist (15)
Unstoppable (9)
Trend Setter
Conference Pioneer
Prolific Year (10)
Keyword Collector (147)
Century Club (44)
Conferences
NIPS (15)
ICML (11)
AISTATS (7)
ICLR (4)
ALT (3)
JMLR (2)
AAAI (1)
UAI (1)
Research topics
Keywords
regret bound (15)
markov decision process (13)
reinforcement learning (11)
sample complexity (6)
contextual bandit (5)
regret minimization (5)
policy gradient (4)
representation learning (4)
optimistic algorithm (3)
function approximation (3)
online learning (3)
contextual linear bandit (3)
stochastic shortest path (3)
linear bandit (2)
step size optimization (2)
policy iteration (2)
value iteration (2)
autonomous exploration (2)
exploration-exploitation tradeoff (2)
multi-armed bandit (2)
Papers
Temporal Difference Flows
ICML 2025
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models
ICLR 2025
Simple Ingredients for Offline Reinforcement Learning
ICML 2024
Fast Imitation via Behavior Foundation Models
ICLR 2024
Contextual bandits with concave rewards, and an application to fair ranking
ICLR 2023
Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path
ALT 2023
Layered State Discovery for Incremental Autonomous Exploration
ICML 2023
On the Complexity of Representation Learning in Contextual Linear Bandits
AISTATS 2023
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
ICLR 2022
Encrypted Linear Contextual Bandit
AISTATS 2022
Top K Ranking for Multi-Armed Bandit with Noisy Evaluations
AISTATS 2022
Adaptive Multi-Goal Exploration
AISTATS 2022
Privacy Amplification via Shuffling for Linear Contextual Bandits
ALT 2022
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees
NIPS 2022
Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model
ALT 2021
Kernel-Based Reinforcement Learning: A Finite-Time Analysis
ICML 2021
Gaussian Approximation for Bias Reduction in Q-Learning
JMLR 2021
Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach
JMLR 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
NIPS 2021
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
NIPS 2021
Local Differential Privacy for Regret Minimization in Reinforcement Learning
NIPS 2021
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
NIPS 2021
Leveraging Good Representations in Linear Contextual Bandits
ICML 2021
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
AISTATS 2021
Active Model Estimation in Markov Decision Processes
UAI 2020
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
NIPS 2020
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
NIPS 2020
Adversarial Attacks on Linear Contextual Bandits
NIPS 2020
Improved Algorithms for Conservative Exploration in Bandits
AAAI 2020
Conservative Exploration in Reinforcement Learning
AISTATS 2020
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
AISTATS 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
ICML 2020
Regret Bounds for Learning State Representations in Reinforcement Learning
NIPS 2019
Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs
NIPS 2019
Importance Weighted Transfer of Samples in Reinforcement Learning
ICML 2018
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
ICML 2018
Stochastic Variance-Reduced Policy Gradient
ICML 2018
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
NIPS 2018
Boosted Fitted Q-Iteration
ICML 2017
Adaptive Batch Size for Safe Policy Gradients
NIPS 2017
Compatible Reward Inverse Reinforcement Learning
NIPS 2017
Regret Minimization in MDPs with Options without Prior Knowledge
NIPS 2017
Adaptive Step-Size for Policy Gradient Methods
NIPS 2013
Safe Policy Iteration
ICML 2013