Multi-Armed Bandits

1044 directly classified papers

Papers

Regret Minimization with Performative Feedback ICML 2022

Langevin Monte Carlo for Contextual Bandits ICML 2022

Masked Language Models Know Which are Popular: A Simple Ranking Strategy for Commonsense Question Answering EMNLP 2022

Versatile Dueling Bandits: Best-of-both World Analyses for Learning from Relative Preferences ICML 2022

Adaptive Gating for Single-Photon 3D Imaging CVPR 2022

Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits ICML 2022

Inverse Contextual Bandits: Learning How Behavior Evolves over Time ICML 2022

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits ICML 2022

Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification NIPS 2022

Instance Dependent Regret Analysis of Kernelized Bandits ICML 2022

Off-Policy Evaluation for Large Action Spaces via Embeddings ICML 2022

Smoothed Adversarial Linear Contextual Bandits with Knapsacks ICML 2022

UniRank: Unimodal Bandit Algorithms for Online Ranking ICML 2022

Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits ICML 2022

Distributionally-Aware Kernelized Bandit Problems for Risk Aversion ICML 2022

Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits ICML 2022

Safe Exploration for Efficient Policy Evaluation and Comparison ICML 2022

Best Heuristic Identification for Constraint Satisfaction IJCAI 2022

Adversarial Attacks on Gaussian Process Bandits ICML 2022

TRAttack: Text Rewriting Attack Against Text Retrieval ACL 2022

Nonstochastic Bandits with Composite Anonymous Feedback JMLR 2022

Multi-Agent Multi-Armed Bandits with Limited Communication JMLR 2022

KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints JMLR 2022

No Weighted-Regret Learning in Adversarial Bandits with Delays JMLR 2022

Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits JMLR 2022