Sciweavers

AAAI
2006
13 years 6 months ago
Incremental Least Squares Policy Iteration for POMDPs
We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...
Hui Li, Xuejun Liao, Lawrence Carin
ECAI
2006
Springer
13 years 8 months ago
Least Squares SVM for Least Squares TD Learning
We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani
PKDD
2009
Springer
13 years 11 months ago
Hybrid Least-Squares Algorithms for Approximate Policy Evaluation
The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Temporal difference methods and Bellman residual m...
Jeffrey Johns, Marek Petrik, Sridhar Mahadevan
ICML
1999
IEEE
14 years 5 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
NIPS
2001
13 years 6 months ago
Model-Free Least-Squares Policy Iteration
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Michail G. Lagoudakis, Ronald Parr