Haipeng Luo
88 papers · 2014–2025 · 9 top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (17)
Interdisciplinary Bridge
Conference Polyglot (9)
Renaissance Researcher (7)
Conference Loyalist (33)
Keyword Champion (32)
Triple Crown
Topic Evolution
Deep Specialist (36)
Dynamic Duo (24)
Unstoppable (12)
Prolific Year (11)
The Questioner (2)
Century Club (88)
Keyword Collector (66)
Conferences
NIPS (33)
COLT (22)
ICML (18)
AISTATS (5)
ICLR (3)
ALT (2)
CVPR (2)
UAI (2)
IJCAI (1)
Keywords
regret bound (48)
online learning (32)
contextual bandit (14)
multi-armed bandit (13)
markov decision process (8)
game theory (8)
bandit feedback (7)
online mirror descent (7)
stochastic shortest path (7)
dynamic regret (7)
adversarial learning (6)
multi-agent system (6)
stochastic optimization (5)
nash equilibrium (5)
adversarial bandit (5)
reinforcement learning (5)
online algorithm (4)
no-regret learning (4)
linear bandit (4)
policy optimization (4)
Papers
Corrupted Learning Dynamics in Games
COLT 2025
Contextual Linear Bandits with Delay as Payoff
ICML 2025
Alternating Regret for Online Convex Optimization
COLT 2025
Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback
COLT 2025
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
ICLR 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
ICLR 2025
Provably Efficient Interactive-Grounded Learning with Personalized Reward
NIPS 2024
WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena
NIPS 2024
Efficient Contextual Bandits with Uninformed Feedback Graphs
ICML 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
ICML 2024
No-Regret Learning for Fair Multi-Agent Social Welfare Optimization
NIPS 2024
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
NIPS 2024
Contextual Multinomial Logit Bandits with General Value Functions
NIPS 2024
Optimal Multiclass U-Calibration Error and Beyond
NIPS 2024
Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
AISTATS 2024
Online Learning in Contextual Second-Price Pay-Per-Click Auctions
AISTATS 2024
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
ICML 2024
On Tractable $\Phi$-Equilibria in Non-Concave Games
NIPS 2024
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
CVPR 2023
Posterior sampling-based online learning for the stochastic shortest path model
UAI 2023
Practical Contextual Bandits with Feedback Graphs
NIPS 2023
Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms
NIPS 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
NIPS 2023
No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions
NIPS 2023
Regret Matching+: (In)Stability and Fast Convergence in Games
NIPS 2023
No-Regret Learning in Two-Echelon Supply Chain with Unknown Demand Distribution
AISTATS 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation
ICML 2023
Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs
ALT 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models
CVPR 2023
Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games
NIPS 2022
Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP
ICML 2022
Learning Infinite-horizon Average-reward Markov Decision Process with Constraints
ICML 2022
Near-Optimal No-Regret Learning Dynamics for General Convex Games
NIPS 2022
Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games
ICML 2022
No-Regret Learning in Time-Varying Zero-Sum Games
ICML 2022
Policy Optimization for Stochastic Shortest Path
COLT 2022
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits
COLT 2022
Adaptive Bandit Convex Optimization with Heterogeneous Curvature
COLT 2022
Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments
NIPS 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
NIPS 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
NIPS 2022
Active Online Learning with Hidden Shifting Domains
AISTATS 2021
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
NIPS 2021
Last-iterate Convergence in Extensive-Form Games
NIPS 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
NIPS 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
NIPS 2021
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
AISTATS 2021
Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds
ALT 2021
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
COLT 2021
Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications
COLT 2021
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games
COLT 2021
Non-stationary Reinforcement Learning without Prior Knowledge: an Optimal Black-box Approach
COLT 2021
Linear Last-iterate Convergence in Constrained Saddle-point Optimization
ICLR 2021
Finding the Stochastic Shortest Path with Low Regret: the Adversarial Cost and Unknown Transition Case
ICML 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
ICML 2021
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs
NIPS 2020
Open Problem: Model Selection for Contextual Bandits
COLT 2020
Comparator-Adaptive Convex Bandits
NIPS 2020
Taking a hint: How to leverage loss predictors in contextual bandits?
COLT 2020
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
COLT 2020
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
ICML 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
ICML 2020
Fair Contextual Multi-Armed Bandits: Theory and Experiments
UAI 2020
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition
NIPS 2020
Improved Path-length Regret Bounds for Bandits
COLT 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
ICML 2019
Model Selection for Contextual Bandits
NIPS 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal and Parameter-free
COLT 2019
Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information
COLT 2019
Equipping Experts/Bandits with Long-term Memory
NIPS 2019
Hypothesis Set Stability and Generalization
NIPS 2019
Efficient Online Portfolio with Logarithmic Regret
NIPS 2018
Practical Contextual Bandits with Regression Oracles
ICML 2018
Efficient Contextual Bandits in Non-stationary Worlds
COLT 2018
More Adaptive Algorithms for Adversarial Bandits
COLT 2018
Logistic Regression: The Importance of Being Improper
COLT 2018
Corralling a Band of Bandit Algorithms
COLT 2017
Open Problem: First-Order Regret Bounds for Contextual Bandits
COLT 2017
Efficient Second Order Online Learning by Sketching
NIPS 2016
Optimal and Adaptive Algorithms for Online Boosting
IJCAI 2016
Variance-Reduced and Projection-Free Stochastic Optimization
ICML 2016
Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits
NIPS 2016
Optimal and Adaptive Algorithms for Online Boosting
ICML 2015
Achieving All with No Parameters: AdaNormalHedge
COLT 2015
Fast Convergence of Regularized Learning in Games
NIPS 2015
Online Gradient Boosting
NIPS 2015
Towards Minimax Online Learning with Unknown Time Horizon
ICML 2014
A Drifting-Games Analysis for Online Learning and Applications to Boosting
NIPS 2014