conftrace_
2009 NIPS NeurIPS 2009

Robust Value Function Approximation Using Bilinear Programming

Abstract

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, it provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP hard, but this is unavoidable because the Bellman-residual minimization itself is NP hard. We, therefore, employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.

🌉 Interdisciplinary Bridge - Machine Learning and Reinforcement Learning
📈 Trend Setter - Value Iteration
🧭 Keyword Pioneer - bilinear programming
🐝 Cross-Pollinator - Artificial Intelligence, Computer Vision, Data Science & Analytics, Healthcare & Medicine, Machine Learning, Mathematics & Optimization, Reinforcement Learning
🐣 Hot Topic Early Bird - reinforcement learning