policy evaluation

115 papers

Also known as

OPE

Co-occurring keywords

reinforcement learning (4122), temporal difference learning (149), value function (294), offline reinforcement learning (492), causal inference (1619), function approximation (319), off-policy learning (227), markov decision process (788), temporal-difference learning (42), linear function approximation (101)

Papers

Policy Evaluation Using the Ω-Return, NIPS 2015

A Deeper Look at Planning as Learning from Replay, ICML 2015

Quasi Newton Temporal Difference Learning, ACML 2014

Policy Evaluation with Temporal Differences: A Survey and Comparison, JMLR 2014

Temporal Difference Methods for the Variance of the Reward To Go, ICML 2013

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces, NIPS 2013

Finite-Sample Analysis of Least-Squares Policy Iteration, JMLR 2012

On Average Reward Policy Evaluation in Infinite-State Partially Observable Systems, AISTATS 2012

A Non-Parametric Approach to Dynamic Programming, NIPS 2011

Generalized TD Learning, JMLR 2011

Linear Complementarity for Regularized Policy Evaluation and Improvement, NIPS 2010

Model-Free Monte Carlo-like Policy Evaluation, AISTATS 2010

Multi-Step Dyna Planning for Policy Evaluation and Control, NIPS 2009

A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation, NIPS 2008

iLSTD: Eligibility Traces and Convergence Analysis, NIPS 2006