Michal Valko
101 papers · 2010–2025 · 10 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (31)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Hot Topic Early Bird
Cross-Pollinator (9)
Conference Polyglot (10)
Conference Loyalist (27)
Keyword Champion (2)
Triple Crown
Deep Specialist (10)
Dynamic Duo (35)
Keyword Collector (107)
Trend Setter
Conference Pioneer
Unstoppable (13)
Prolific Year (19)
Century Club (101)
Conferences
ICML (41)
NIPS (27)
AISTATS (16)
ALT (5)
COLT (3)
ICLR (3)
JMLR (3)
ECCV (1)
ICCV (1)
IJCAI (1)
Research topics
Keywords
regret bound (29)
multi-armed bandit (18)
online learning (16)
sample complexity (14)
reinforcement learning (11)
markov decision process (10)
stochastic optimization (8)
online algorithm (6)
regret minimization (6)
value function (5)
upper confidence bound (5)
determinantal point process (5)
policy optimization (5)
self-supervised learning (5)
global optimization (4)
gaussian process (4)
deep reinforcement learning (4)
active learning (4)
cumulative regret (4)
value iteration (3)
Papers
The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
ICML 2025
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
NIPS 2024
Unlocking the Power of Representations in Long-term Novelty-based Exploration
ICLR 2024
Demonstration-Regularized RL
ICLR 2024
A General Theoretical Paradigm to Understand Learning from Human Preferences
AISTATS 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment
ICML 2024
Nash Learning from Human Feedback
ICML 2024
Decoding-time Realignment of Language Models
ICML 2024
Human Alignment of Large Language Models through Online Preference Optimisation
ICML 2024
Local and Adaptive Mirror Descents in Extensive-Form Games
NIPS 2024
Quantile Credit Assignment
ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
ICML 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
ICML 2023
VA-learning as a more efficient alternative to Q-learning
ICML 2023
Fast Rates for Maximum Entropy Exploration
ICML 2023
Model-free Posterior Sampling via Learning Rate Randomization
NIPS 2023
Understanding Self-Predictive Learning for Reinforcement Learning
ICML 2023
Adapting to game trees in zero-sum imperfect information games
ICML 2023
Half-Hop: A graph upsampling approach for slowing down message passing
ICML 2023
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
ICML 2023
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees
NIPS 2022
Large-Scale Representation Learning on Graphs via Bootstrapping
ICLR 2022
BYOL-Explore: Exploration by Bootstrapped Prediction
NIPS 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
ICML 2022
Retrieval-Augmented Reinforcement Learning
ICML 2022
Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times
ICML 2022
Marginalized Operators for Off-policy Reinforcement Learning
AISTATS 2022
Adaptive Multi-Goal Exploration
AISTATS 2022
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
NIPS 2021
Broaden Your Views for Self-Supervised Video Learning
ICCV 2021
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
AISTATS 2021
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
ALT 2021
Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
ICML 2021
Taylor Expansion of Discount Factors
ICML 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
ICML 2021
Kernel-Based Reinforcement Learning: A Finite-Time Analysis
ICML 2021
Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model
ALT 2021
Adaptive Reward-Free Exploration
ALT 2021
Online A-Optimal Design and Active Linear Regression
ICML 2021
Fast active learning for pure exploration in reinforcement learning
ICML 2021
Learning in two-player zero-sum partially observable Markov games with perfect recall
NIPS 2021
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity
NIPS 2021
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
NIPS 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
NIPS 2021
A single algorithm for both restless and rested rotting bandits
AISTATS 2020
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
NIPS 2020
Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits
NIPS 2020
Sampling from a k-DPP without looking at all items
NIPS 2020
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
NIPS 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
NIPS 2020
Fixed-confidence guarantees for Bayesian best-arm identification
AISTATS 2020
Derivative-Free & Order-Robust Optimisation
AISTATS 2020
Adaptive multi-fidelity optimization with fast learning rates
AISTATS 2020
Covariance-adapting algorithm for semi-bandits with application to sparse outcomes
COLT 2020
Near-linear time Gaussian process optimization with adaptive batching and resparsification
ICML 2020
Gamification of Pure Exploration for Linear Bandits
ICML 2020
Stochastic bandits with arm-dependent delays
ICML 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
ICML 2020
Budgeted Online Influence Maximization
ICML 2020
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards
ICML 2020
Taylor Expansion Policy Optimization
ICML 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning
ICML 2020
Spectral bandits
JMLR 2020
Active multiple matrix completion with adaptive confidence sets
AISTATS 2019
Exploiting structure of uncertainty for efficient matroid semi-bandits
ICML 2019
Scale-free adaptive planning for deterministic dynamics & discounted rewards
ICML 2019
Exact sampling of determinantal point processes with sublinear time preprocessing
NIPS 2019
A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption
ALT 2019
Multiagent Evaluation under Incomplete Information
NIPS 2019
Rotting bandits are no harder than stochastic ones
AISTATS 2019
Finding the bandit in a graph: Sequential search-and-stop
AISTATS 2019
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
COLT 2019
On two ways to use determinantal point processes for Monte Carlo integration
NIPS 2019
Planning in entropy-regularized Markov decision processes and games
NIPS 2019
DPPy: DPP Sampling with Python
JMLR 2019
General parallel optimization without a metric
ALT 2019
Improved large-scale graph learning through ridge spectral sparsification
ICML 2018
Best of both worlds: Stochastic & adversarial best-arm identification
COLT 2018
Compressing the Input for CNNs with the First-Order Scattering Transform
ECCV 2018
Optimistic optimization of a Brownian
NIPS 2018
Zonotope Hit-and-run for Efficient Sampling from Projection DPPs
ICML 2017
Efficient Second-Order Online Kernel Learning with Adaptive Embedding
NIPS 2017
Distributed Adaptive Sampling for Kernel Matrix Approximation
AISTATS 2017
Trading off Rewards and Errors in Multi-Armed Bandits
AISTATS 2017
Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
NIPS 2017
Second-Order Kernel Online Convex Optimization with Adaptive Sketching
ICML 2017
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
NIPS 2016
Online Learning with Noisy Side Observations
AISTATS 2016
Bayesian Policy Gradient and Actor-Critic Algorithms
JMLR 2016
Revealing Graph Bandits for Maximizing Local Influence
AISTATS 2016
Pliable Rejection Sampling
ICML 2016
Black-box optimization of noisy functions with unknown smoothness
NIPS 2015
Maximum Entropy Semi-Supervised Inverse Reinforcement Learning
IJCAI 2015
Simple regret for infinitely many armed bandits
ICML 2015
Cheap Bandits
ICML 2015
Spectral Bandits for Smooth Graph Functions
ICML 2014
Online combinatorial optimization with stochastic decision sets and adversarial losses
NIPS 2014
Extreme bandits
NIPS 2014
Efficient learning by implicit exploration in bandit problems with side observations
NIPS 2014
Stochastic Simultaneous Optimistic Optimization
ICML 2013
Semi-Supervised Learning with Max-Margin Graph Cuts
AISTATS 2010