Marcello Restelli
89 papers · 2007–2025 · 9 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (22)
Interdisciplinary Bridge
Conference Polyglot (9)
Keyword Trendsetter Combo (5)
Conference Loyalist (24)
Topic Evolution
Dynamic Duo (42)
Grand Slam
Mega-Team (22)
Triple Crown
Deep Specialist (40)
Keyword Champion (4)
Prolific Year (13)
Keyword Collector (86)
Trend Setter
Century Club (89)
Unstoppable (10)
The Questioner
Conference Pioneer
Conferences
ICML (26)
NIPS (24)
AAAI (14)
AISTATS (9)
JMLR (6)
ICLR (3)
IJCAI (3)
UAI (3)
COLT (1)
Research topics
Keywords
reinforcement learning (23)
regret bound (13)
policy gradient (11)
policy optimization (9)
online learning (9)
sample complexity (9)
importance sampling (8)
markov decision process (8)
inverse reinforcement learning (7)
multi-armed bandit (6)
transfer learning (6)
reward function (6)
regret minimization (5)
policy learning (5)
imitation learning (5)
function approximation (4)
off-policy learning (4)
continuous control (4)
sequential decision-making (4)
online algorithm (4)
Papers
Enhancing Diversity In Parallel Agents: A Maximum State Entropy Exploration Story
ICML 2025
Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models
AISTATS 2025
Efficient Exploitation of Hierarchical Structure in Sparse Reward Reinforcement Learning
AISTATS 2025
Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning
NIPS 2024
Optimal Multi-Fidelity Best-Arm Identification
NIPS 2024
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs
COLT 2024
Autoregressive Bandits
AISTATS 2024
Parameterized Projected Bellman Operator
AAAI 2024
How to Explore with Belief: State Entropy Maximization in POMDPs
ICML 2024
Best Arm Identification for Stochastic Rising Bandits
ICML 2024
Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning
ICLR 2024
Factored-Reward Bandits with Intermediate Observations
ICML 2024
Information Capacity Regret Bounds for Bandits with Mediator Feedback
JMLR 2024
No-Regret Reinforcement Learning in Smooth MDPs
ICML 2024
Graph-Triggered Rising Bandits
ICML 2024
A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics
NIPS 2024
Online Markov Decision Processes Configuration with Continuous Decision Space
AAAI 2024
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
NIPS 2024
Bandits with Ranking Feedback
NIPS 2024
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
AAAI 2023
Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization
AAAI 2023
Simultaneously Updating All Persistence Values in Reinforcement Learning
AAAI 2023
Dynamic Pricing with Volume Discounts in Online Settings
AAAI 2023
Truncating Trajectories in Monte Carlo Reinforcement Learning
ICML 2023
Dynamical Linear Bandits
ICML 2023
Towards Theoretical Understanding of Inverse Reinforcement Learning
ICML 2023
On the Relation between Policy Improvement and Off-Policy Minimum-Variance Policy Evaluation
UAI 2023
Convex Reinforcement Learning in Finite Trials
JMLR 2023
Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach
NIPS 2023
Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning
NIPS 2023
A Tale of Sampling and Estimation in Discounted Reinforcement Learning
AISTATS 2023
Tight Performance Guarantees of Imitator Policies with Continuous Actions
AAAI 2023
Stochastic Rising Bandits
ICML 2022
Challenging Common Assumptions in Convex Reinforcement Learning
NIPS 2022
Multi-Fidelity Best-Arm Identification
NIPS 2022
Off-Policy Evaluation with Deficient Support Using Side Information
NIPS 2022
Lifelong Hyper-Policy Optimization with Multiple Importance Sampling Regularization
AAAI 2022
Unsupervised Reinforcement Learning in Multiple Environments
AAAI 2022
Reward-Free Policy Space Compression for Reinforcement Learning
AISTATS 2022
Finite Sample Analysis of Mean-Volatility Actor-Critic for Risk-Averse Reinforcement Learning
AISTATS 2022
Goal-Directed Planning via Hindsight Experience Replay
ICLR 2022
Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning
ICML 2022
Delayed Reinforcement Learning by Imitation
ICML 2022
The Importance of Non-Markovianity in Maximum State Entropy Exploration
ICML 2022
Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts
IJCAI 2022
Learning in Markov games: Can we exploit a general-sum opponent?
UAI 2022
Meta-Reinforcement Learning by Tracking Task Non-stationarity
IJCAI 2021
Gaussian Approximation for Bias Reduction in Q-Learning
JMLR 2021
Learning in Non-Cooperative Configurable Markov Decision Processes
NIPS 2021
Provably Efficient Learning of Transferable Rewards
ICML 2021
Leveraging Good Representations in Linear Contextual Bandits
ICML 2021
Time-variant variational transfer for value functions
UAI 2021
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
NIPS 2021
Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning
NIPS 2021
Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach
JMLR 2021
MushroomRL: Simplifying Reinforcement Learning Research
JMLR 2021
Newton Optimization on Helmholtz Decomposition for Continuous Games
AAAI 2021
Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate
AAAI 2021
Policy Optimization as Online Learning with Mediator Feedback
AAAI 2021
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
NIPS 2020
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
ICML 2020
Sequential Transfer in Reinforcement Learning with a Generative Model
ICML 2020
An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies
AAAI 2020
Importance Sampling Techniques for Policy Optimization
JMLR 2020
Risk-Averse Trust Region Optimization for Reward-Volatility Reduction
IJCAI 2020
Sharing Knowledge in Multi-Task Deep Reinforcement Learning
ICLR 2020
Gradient-Aware Model-Based Policy Search
AAAI 2020
Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration
AISTATS 2020
Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions
AISTATS 2020
A Novel Confidence-Based Algorithm for Structured Bandits
AISTATS 2020
Inverse Reinforcement Learning from a Gradient-based Learner
NIPS 2020
Transfer of Samples in Policy Search via Multiple Importance Sampling
ICML 2019
Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
NIPS 2019
Reinforcement Learning in Configurable Continuous Environments
ICML 2019
Optimistic Policy Optimization via Multiple Importance Sampling
ICML 2019
Stochastic Variance-Reduced Policy Gradient
ICML 2018
Configurable Markov Decision Processes
ICML 2018
Policy Optimization via Importance Sampling
NIPS 2018
Transfer of Value Functions via Variational Methods
NIPS 2018
Importance Weighted Transfer of Samples in Reinforcement Learning
ICML 2018
Adaptive Batch Size for Safe Policy Gradients
NIPS 2017
Compatible Reward Inverse Reinforcement Learning
NIPS 2017
Boosted Fitted Q-Iteration
ICML 2017
Estimating Maximum Expected Value through Gaussian Approximation
ICML 2016
Sparse Multi-Task Reinforcement Learning
NIPS 2014
Safe Policy Iteration
ICML 2013
Adaptive Step-Size for Policy Gradient Methods
NIPS 2013
Transfer from Multiple MDPs
NIPS 2011
Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
NIPS 2007