Odalric-Ambrym Maillard

38 papers · 2010–2025 · 9 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (21) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (15) 🗺️ Taxonomy Completionist (21) 🐺 Lone Wolf (3) 🔬 Deep Specialist (16) 🌱 Topic Pioneer 🏆 Keyword Champion (2) 💎 Century Club (38) 🗃️ Keyword Collector (54) 📈 Trend Setter 🔥 Unstoppable (9) 🚀 Conference Pioneer ⚡ Prolific Year (5) ❓ The Questioner

Conferences

NIPS (16) ALT (5) ACML (4) ICML (4) AISTATS (3) JMLR (3) COLT (1) ICLR (1) UAI (1)

Top co-authors

Mohammad Sadegh Talebi (5) Rémi Munos (5) Fabien Pesquerel (4) Hassan SABER (3) Emilie Kaufmann (3) Daniil Ryabko (3) Edouard Leurent (3) Ronald Ortner (3) Dorian Baudry (3) Phuong Nguyen (2)

Keywords

regret bound (19) multi-armed bandit (10) markov decision process (10) reinforcement learning (8) regret minimization (5) online learning (4) stochastic optimization (4) upper confidence bound (3) concentration inequality (3) state representation (3) bandit algorithm (3) sample complexity (2) asymptotic optimality (2) active learning (2) regret analysis (2) model-based reinforcement learning (2) change-point detection (2) kullback-leibler divergence (2) reproducing kernel hilbert space (2) exponential family (2)

Papers

Monte-Carlo Tree Search with Uncertainty Propagation via Optimal Transport · ICML 2025
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption · ALT 2024
Power Mean Estimation in Stochastic Monte-Carlo Tree Search · UAI 2024
Logarithmic regret in communicating MDPs: Leveraging known dynamics with bandits · ACML 2023
Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits · NIPS 2023
Exploration in Reward Machines with Low Regret · AISTATS 2023
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits · JMLR 2022
IMED-RL: Regret optimal learning of ergodic Markov decision processes · NIPS 2022
Indexed Minimum Empirical Divergence for Unimodal Bandits · NIPS 2021
From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits · NIPS 2021
Stochastic bandits with groups of similar arms. · NIPS 2021
Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge · NIPS 2021
Learning Value Functions in Deep Policy Gradients using Residual Variance · ICLR 2021
Reinforcement Learning in Parametric MDPs with Exponential Families · AISTATS 2021
Sub-sampling for Efficient Non-Parametric Bandit Exploration · NIPS 2020
Monte-Carlo Graph Search: the Value of Merging Similar States · ACML 2020
Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs · NIPS 2020
Learning Multiple Markov Chains via Adaptive Allocation · NIPS 2019
Regret Bounds for Learning State Representations in Reinforcement Learning · NIPS 2019
Model-Based Reinforcement Learning Exploiting State-Action Equivalence · ACML 2019
Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds · ALT 2019
Budgeted Reinforcement Learning in Continuous State Space · NIPS 2019
Streaming kernel regression with provably adaptive mean, variance, and regularization · JMLR 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs · ALT 2018
Boundary Crossing for General Exponential Families · ALT 2017
Efficient tracking of a growing number of experts · ALT 2017
Spectral Learning from a Single Trajectory under Finite-State Policies · ICML 2017
Latent Bandits. · ICML 2014
"How hard is my MDP?" The distribution-norm to the rescue · NIPS 2014
Competing with an Infinite Set of Models in Reinforcement Learning · AISTATS 2013
Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning · ICML 2013
Hierarchical Optimistic Region Selection driven by Curiosity · NIPS 2012
Online allocation and homogeneous partitioning for piecewise constant mean-approximation · NIPS 2012
Linear Regression With Random Projections · JMLR 2012
Sparse Recovery with Brownian Sensing · NIPS 2011
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences · COLT 2011
Selecting the State-Representation in Reinforcement Learning · NIPS 2011
Finite-sample Analysis of Bellman Residual Minimization · ACML 2010