conftrace_

Michal Valko

101 papers · 2010–2025 · 10 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (31) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (9) 🌍 Conference Polyglot (10) 🏠 Conference Loyalist (27) 🏆 Keyword Champion (2) 👑 Triple Crown 🔬 Deep Specialist (10) 🤝 Dynamic Duo (35) 🗃️ Keyword Collector (107) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (13) ⚡ Prolific Year (19) 💎 Century Club (101)

Conferences

ICML (41) NIPS (27) AISTATS (16) ALT (5) COLT (3) ICLR (3) JMLR (3) ECCV (1) ICCV (1) IJCAI (1)

Top co-authors

Rémi Munos (35) Pierre Menard (22) Daniele Calandriello (21) Alessandro Lazaric (17) Yunhao Tang (17) Mark Rowland (14) Omar Darwiche Domingues (9) Pierre Perrault (9) Jean-Bastien Grill (8) Alexandra Carpentier (8)

Research topics

Keywords

regret bound (29) multi-armed bandit (18) online learning (16) sample complexity (14) reinforcement learning (11) markov decision process (10) stochastic optimization (8) online algorithm (6) regret minimization (6) value function (5) upper confidence bound (5) determinantal point process (5) policy optimization (5) self-supervised learning (5) global optimization (4) gaussian process (4) deep reinforcement learning (4) active learning (4) cumulative regret (4) value iteration (3)

Papers

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback ICML 2025
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving NIPS 2024
Unlocking the Power of Representations in Long-term Novelty-based Exploration ICLR 2024
Demonstration-Regularized RL ICLR 2024
A General Theoretical Paradigm to Understand Learning from Human Preferences AISTATS 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment ICML 2024
Nash Learning from Human Feedback ICML 2024
Decoding-time Realignment of Language Models ICML 2024
Human Alignment of Large Language Models through Online Preference Optimisation ICML 2024
Local and Adaptive Mirror Descents in Extensive-Form Games NIPS 2024
Quantile Credit Assignment ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice ICML 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm ICML 2023
VA-learning as a more efficient alternative to Q-learning ICML 2023
Fast Rates for Maximum Entropy Exploration ICML 2023
Model-free Posterior Sampling via Learning Rate Randomization NIPS 2023
Understanding Self-Predictive Learning for Reinforcement Learning ICML 2023
Adapting to game trees in zero-sum imperfect information games ICML 2023
Half-Hop: A graph upsampling approach for slowing down message passing ICML 2023
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments ICML 2023
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees NIPS 2022
Large-Scale Representation Learning on Graphs via Bootstrapping ICLR 2022
BYOL-Explore: Exploration by Bootstrapped Prediction NIPS 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses ICML 2022
Retrieval-Augmented Reinforcement Learning ICML 2022
Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times ICML 2022
Marginalized Operators for Off-policy Reinforcement Learning AISTATS 2022
Adaptive Multi-Goal Exploration AISTATS 2022
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation NIPS 2021
Broaden Your Views for Self-Supervised Video Learning ICCV 2021
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces AISTATS 2021
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited ALT 2021
Revisiting Peng’s Q(λ) for Modern Reinforcement Learning ICML 2021
Taylor Expansion of Discount Factors ICML 2021
UCB Momentum Q-learning: Correcting the bias without forgetting ICML 2021
Kernel-Based Reinforcement Learning: A Finite-Time Analysis ICML 2021
Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model ALT 2021
Adaptive Reward-Free Exploration ALT 2021
Online A-Optimal Design and Active Linear Regression ICML 2021
Fast active learning for pure exploration in reinforcement learning ICML 2021
Learning in two-player zero-sum partially observable Markov games with perfect recall NIPS 2021
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity NIPS 2021
A Provably Efficient Sample Collection Strategy for Reinforcement Learning NIPS 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret NIPS 2021
A single algorithm for both restless and rested rotting bandits AISTATS 2020
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity NIPS 2020
Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits NIPS 2020
Sampling from a k-DPP without looking at all items NIPS 2020
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs NIPS 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning NIPS 2020
Fixed-confidence guarantees for Bayesian best-arm identification AISTATS 2020
Derivative-Free & Order-Robust Optimisation AISTATS 2020
Adaptive multi-fidelity optimization with fast learning rates AISTATS 2020
Covariance-adapting algorithm for semi-bandits with application to sparse outcomes COLT 2020
Near-linear time Gaussian process optimization with adaptive batching and resparsification ICML 2020
Gamification of Pure Exploration for Linear Bandits ICML 2020
Stochastic bandits with arm-dependent delays ICML 2020
Monte-Carlo Tree Search as Regularized Policy Optimization ICML 2020
Budgeted Online Influence Maximization ICML 2020
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards ICML 2020
Taylor Expansion Policy Optimization ICML 2020
No-Regret Exploration in Goal-Oriented Reinforcement Learning ICML 2020
Spectral bandits JMLR 2020
Active multiple matrix completion with adaptive confidence sets AISTATS 2019
Exploiting structure of uncertainty for efficient matroid semi-bandits ICML 2019
Scale-free adaptive planning for deterministic dynamics & discounted rewards ICML 2019
Exact sampling of determinantal point processes with sublinear time preprocessing NIPS 2019
A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption ALT 2019
Multiagent Evaluation under Incomplete Information NIPS 2019
Rotting bandits are no harder than stochastic ones AISTATS 2019
Finding the bandit in a graph: Sequential search-and-stop AISTATS 2019
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret COLT 2019
On two ways to use determinantal point processes for Monte Carlo integration NIPS 2019
Planning in entropy-regularized Markov decision processes and games NIPS 2019
DPPy: DPP Sampling with Python JMLR 2019
General parallel optimization without a metric ALT 2019
Improved large-scale graph learning through ridge spectral sparsification ICML 2018
Best of both worlds: Stochastic & adversarial best-arm identification COLT 2018
Compressing the Input for CNNs with the First-Order Scattering Transform ECCV 2018
Optimistic optimization of a Brownian NIPS 2018
Zonotope Hit-and-run for Efficient Sampling from Projection DPPs ICML 2017
Efficient Second-Order Online Kernel Learning with Adaptive Embedding NIPS 2017
Distributed Adaptive Sampling for Kernel Matrix Approximation AISTATS 2017
Trading off Rewards and Errors in Multi-Armed Bandits AISTATS 2017
Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback NIPS 2017
Second-Order Kernel Online Convex Optimization with Adaptive Sketching ICML 2017
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning NIPS 2016
Online Learning with Noisy Side Observations AISTATS 2016
Bayesian Policy Gradient and Actor-Critic Algorithms JMLR 2016
Revealing Graph Bandits for Maximizing Local Influence AISTATS 2016
Pliable Rejection Sampling ICML 2016
Black-box optimization of noisy functions with unknown smoothness NIPS 2015
Maximum Entropy Semi-Supervised Inverse Reinforcement Learning IJCAI 2015
Simple regret for infinitely many armed bandits ICML 2015
Cheap Bandits ICML 2015
Spectral Bandits for Smooth Graph Functions ICML 2014
Online combinatorial optimization with stochastic decision sets and adversarial losses NIPS 2014
Extreme bandits NIPS 2014
Efficient learning by implicit exploration in bandit problems with side observations NIPS 2014
Stochastic Simultaneous Optimistic Optimization ICML 2013
Semi-Supervised Learning with Max-Margin Graph Cuts AISTATS 2010