Sciweavers

24 search results - page 1 / 5
» Technical Update: Least-Squares Temporal Difference Learning
Sort
View
ML
2002
ACM
154views Machine Learning» more  ML 2002»
13 years 4 months ago
Technical Update: Least-Squares Temporal Difference Learning
TD() is a popular family of algorithms for approximate policy evaluation in large MDPs. TD() works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
ICML
1999
IEEE
14 years 5 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
ECAI
2006
Springer
13 years 8 months ago
Least Squares SVM for Least Squares TD Learning
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani
AAAI
2006
13 years 6 months ago
Incremental Least-Squares Temporal Difference Learning
Alborz Geramifard, Michael H. Bowling, Richard S. ...
ICML
2010
IEEE
13 years 5 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu