Multi-Armed Bandits

1044 directly classified papers

Papers

Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification NIPS 2022

Smoothed Adversarial Linear Contextual Bandits with Knapsacks ICML 2022

Trading Off Resource Budgets For Improved Regret Bounds NIPS 2022

Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits ICML 2022

Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits ICML 2022

Best Heuristic Identification for Constraint Satisfaction IJCAI 2022

Factored DRO: Factored Distributionally Robust Policies for Contextual Bandits NIPS 2022

Deep Hierarchy in Bandits ICML 2022

A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds ICML 2022

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees NIPS 2022

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits ICML 2022

Regret Minimization with Performative Feedback ICML 2022

Inverse Contextual Bandits: Learning How Behavior Evolves over Time ICML 2022

IMED-RL: Regret optimal learning of ergodic Markov decision processes NIPS 2022

Towards Off-Policy Learning for Ranking Policies with Logged Feedback AAAI 2022

Differentially Private Regret Minimization in Episodic Markov Decision Processes AAAI 2022

Fixed-Budget Best-Arm Identification in Structured Bandits IJCAI 2022

Effective Dimension in Bandit Problems under Censorship NIPS 2022

Instance Dependent Regret Analysis of Kernelized Bandits ICML 2022

Multi-slots Online Matching with High Entropy ICML 2022

No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation NIPS 2022

A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits AAAI 2022

Learning in Congestion Games with Bandit Feedback NIPS 2022

When Combinatorial Thompson Sampling meets Approximation Regret NIPS 2022

An $\alpha$-No-Regret Algorithm For Graphical Bilinear Bandits NIPS 2022