Sciweavers

ML
2002
ACM
154views Machine Learning» more  ML 2002»
13 years 4 months ago
Technical Update: Least-Squares Temporal Difference Learning
TD() is a popular family of algorithms for approximate policy evaluation in large MDPs. TD() works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
ICML
1999
IEEE
14 years 5 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan