Research Explorer
Multi-Armed Bandits
1044 directly classified papers
Papers per year
2002: 1
2006: 2
2007: 3
2008: 5
2009: 3
2010: 5
2011: 23
2012: 16
2013: 32
2014: 42
2015: 27
2016: 33
2017: 46
2018: 55
2019: 80
2020: 87
2021: 124
2022: 160
2023: 136
2024: 126
2025: 38
Papers
Regret Minimization with Performative Feedback
ICML 2022
Langevin Monte Carlo for Contextual Bandits
ICML 2022
Masked Language Models Know Which are Popular: A Simple Ranking Strategy for Commonsense Question Answering
EMNLP 2022
Versatile Dueling Bandits: Best-of-both World Analyses for Learning from Relative Preferences
ICML 2022
Adaptive Gating for Single-Photon 3D Imaging
CVPR 2022
Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits
ICML 2022
Inverse Contextual Bandits: Learning How Behavior Evolves over Time
ICML 2022
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
ICML 2022
Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification
NeurIPS 2022
Instance Dependent Regret Analysis of Kernelized Bandits
ICML 2022
Off-Policy Evaluation for Large Action Spaces via Embeddings
ICML 2022
Smoothed Adversarial Linear Contextual Bandits with Knapsacks
ICML 2022
UniRank: Unimodal Bandit Algorithms for Online Ranking
ICML 2022
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits
ICML 2022
Distributionally-Aware Kernelized Bandit Problems for Risk Aversion
ICML 2022
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
ICML 2022
Safe Exploration for Efficient Policy Evaluation and Comparison
ICML 2022
Best Heuristic Identification for Constraint Satisfaction
IJCAI 2022
Adversarial Attacks on Gaussian Process Bandits
ICML 2022
TRAttack: Text Rewriting Attack Against Text Retrieval
ACL 2022
Nonstochastic Bandits with Composite Anonymous Feedback
JMLR 2022
Multi-Agent Multi-Armed Bandits with Limited Communication
JMLR 2022
KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints
JMLR 2022
No Weighted-Regret Learning in Adversarial Bandits with Delays
JMLR 2022
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
JMLR 2022