Papers
Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition
AISTATS 2024
Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms
L4DC 2024
OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments
AAAI 2024
Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis
AISTATS 2024
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
AISTATS 2024
PDE control gym: A benchmark for data-driven boundary control of partial differential equations
L4DC 2024
Discerning Temporal Difference Learning
AAAI 2024