Sciweavers

11 search results - page 1 / 3
» Model-Free Least-Squares Policy Iteration
Sort
View
NIPS
2001
13 years 5 months ago
Model-Free Least-Squares Policy Iteration
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Michail G. Lagoudakis, Ronald Parr
ECAI
2006
Springer
13 years 8 months ago
Least Squares SVM for Least Squares TD Learning
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani
AAAI
2006
13 years 6 months ago
Incremental Least Squares Policy Iteration for POMDPs
We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...
Hui Li, Xuejun Liao, Lawrence Carin
PKDD
2009
Springer
169views Data Mining» more  PKDD 2009»
13 years 11 months ago
Hybrid Least-Squares Algorithms for Approximate Policy Evaluation
The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Temporal difference methods and Bellman residual m...
Jeffrey Johns, Marek Petrik, Sridhar Mahadevan