Odalric-Ambrym Maillard
38 papers · 2010–2025 · 9 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (21)
Interdisciplinary Bridge
Renaissance Researcher (5)
Hot Topic Early Bird
Academic Marathon (15)
Lone Wolf (3)
Deep Specialist (16)
Topic Pioneer
Keyword Champion (2)
Century Club (38)
Keyword Collector (54)
Trend Setter
Unstoppable (9)
Conference Pioneer
Prolific Year (5)
The Questioner
Conferences
NIPS (16)
ALT (5)
ACML (4)
ICML (4)
AISTATS (3)
JMLR (3)
COLT (1)
ICLR (1)
UAI (1)
Keywords
regret bound (19)
multi-armed bandit (10)
Markov decision process (10)
reinforcement learning (8)
regret minimization (5)
online learning (4)
stochastic optimization (4)
upper confidence bound (3)
concentration inequality (3)
state representation (3)
bandit algorithm (3)
sample complexity (2)
asymptotic optimality (2)
active learning (2)
regret analysis (2)
model-based reinforcement learning (2)
change-point detection (2)
Kullback-Leibler divergence (2)
reproducing kernel Hilbert space (2)
exponential family (2)
Papers
Monte-Carlo Tree Search with Uncertainty Propagation via Optimal Transport
ICML 2025
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption
ALT 2024
Power Mean Estimation in Stochastic Monte-Carlo Tree Search
UAI 2024
Logarithmic regret in communicating MDPs: Leveraging known dynamics with bandits
ACML 2023
Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits
NIPS 2023
Exploration in Reward Machines with Low Regret
AISTATS 2023
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
JMLR 2022
IMED-RL: Regret optimal learning of ergodic Markov decision processes
NIPS 2022
Indexed Minimum Empirical Divergence for Unimodal Bandits
NIPS 2021
From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits
NIPS 2021
Stochastic bandits with groups of similar arms
NIPS 2021
Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge
NIPS 2021
Learning Value Functions in Deep Policy Gradients using Residual Variance
ICLR 2021
Reinforcement Learning in Parametric MDPs with Exponential Families
AISTATS 2021
Sub-sampling for Efficient Non-Parametric Bandit Exploration
NIPS 2020
Monte-Carlo Graph Search: the Value of Merging Similar States
ACML 2020
Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs
NIPS 2020
Learning Multiple Markov Chains via Adaptive Allocation
NIPS 2019
Regret Bounds for Learning State Representations in Reinforcement Learning
NIPS 2019
Model-Based Reinforcement Learning Exploiting State-Action Equivalence
ACML 2019
Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds
ALT 2019
Budgeted Reinforcement Learning in Continuous State Space
NIPS 2019
Streaming kernel regression with provably adaptive mean, variance, and regularization
JMLR 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
ALT 2018
Boundary Crossing for General Exponential Families
ALT 2017
Efficient tracking of a growing number of experts
ALT 2017
Spectral Learning from a Single Trajectory under Finite-State Policies
ICML 2017
Latent Bandits
ICML 2014
"How hard is my MDP?" The distribution-norm to the rescue
NIPS 2014
Competing with an Infinite Set of Models in Reinforcement Learning
AISTATS 2013
Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning
ICML 2013
Hierarchical Optimistic Region Selection driven by Curiosity
NIPS 2012
Online allocation and homogeneous partitioning for piecewise constant mean-approximation
NIPS 2012
Linear Regression With Random Projections
JMLR 2012
Sparse Recovery with Brownian Sensing
NIPS 2011
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
COLT 2011
Selecting the State-Representation in Reinforcement Learning
NIPS 2011
Finite-sample Analysis of Bellman Residual Minimization
ACML 2010