conftrace_

reinforcement learning

4352 papers

Also known as

RL, REINFORCE

Co-occurring keywords

large language model (13587), policy learning (702), Markov decision process (790), policy optimization (657), policy gradient (520), deep reinforcement learning (903), multi-agent system (1819), imitation learning (744), regret bound (1926), language model (4599)

Papers

The Fixed Points of Off-Policy TD NIPS 2011

Transfer from Multiple MDPs NIPS 2011

TD_gamma: Re-evaluating Complex Backups in Temporal Difference Learning NIPS 2011

Blending Autonomous Exploration and Apprenticeship Learning NIPS 2011

A Reinforcement Learning Theory for Homeostatic Regulation NIPS 2011

Agnostic KWIK learning and efficient approximate reinforcement learning COLT 2011

Speedy Q-Learning NIPS 2011

Improving Policy Gradient Estimates with Influence Information ACML 2011

A reinterpretation of the policy oscillation phenomenon in approximate policy iteration NIPS 2011

Selecting the State-Representation in Reinforcement Learning NIPS 2011

Robust Approximate Bilinear Programming for Value Function Approximation JMLR 2011

Generalized TD Learning JMLR 2011

Optimal Reinforcement Learning for Gaussian Systems NIPS 2011

Exploiting Best-Match Equations for Efficient Reinforcement Learning JMLR 2011

Learning to Agglomerate Superpixel Hierarchies NIPS 2011

Analysis and Improvement of Policy Gradient Estimation NIPS 2011

Convergent Fitted Value Iteration with Linear Function Approximation NIPS 2011

Policy Gradient Coagent Networks NIPS 2011

Dynamic Policy Programming with Function Approximation AISTATS 2011

Action-Gap Phenomenon in Reinforcement Learning NIPS 2011

A Convergent Online Single Time Scale Actor Critic Algorithm JMLR 2010

Effects of Synaptic Weight Diffusion on Learning in Decision Making Networks NIPS 2010

Double Q-learning NIPS 2010

Model-Free Monte Carlo-like Policy Evaluation AISTATS 2010

Predictive State Temporal Difference Learning NIPS 2010