Convergence of Least Squares Temporal Difference Methods Under General Conditions

We consider approximate policy evaluation for finite state and action Markov decision processes (MDPs) in the off-policy learning context, using the simulation-based least squares temporal difference algorithm LSTD(λ). We establish for the discounted cost criterion that off-policy LSTD(λ) converges almost surely under mild, minimal conditions. We also analyze other convergence and boundedness properties of the iterates involved in the algorithm and, based on them, suggest a modification in its practical implementation. Our analysis uses theories of both finite-space Markov chains and Markov chains on topological spaces.
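For readers unfamiliar with the algorithm, below is a minimal sketch of one common formulation of off-policy LSTD(λ), in which transitions generated by a behavior policy μ are reweighted with per-decision importance-sampling ratios ρ_t = π(a_t|s_t) / μ(a_t|s_t) to evaluate a target policy π. It is illustrative only and does not reproduce the paper's exact iterates or its suggested implementation change; the names `phi` and `rho`, the trajectory format, and the ridge term `reg` are assumptions of the sketch.

```python
import numpy as np

def off_policy_lstd_lambda(transitions, phi, rho, gamma=0.9, lam=0.5, reg=1e-6):
    """Estimate theta with V(s) ~ phi(s) @ theta from one trajectory
    generated by the behavior policy.

    transitions : list of (s, a, r, s_next) tuples
    phi         : s -> feature vector (numpy array of length d)
    rho         : (s, a) -> importance ratio pi(a|s) / mu(a|s)
    """
    d = len(phi(transitions[0][0]))
    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)       # eligibility trace
    rho_prev = 1.0        # ratio of the previous transition
    for s, a, r, s_next in transitions:
        # the trace carries the product of past ratios, discounted by gamma*lambda
        z = gamma * lam * rho_prev * z + phi(s)
        rho_t = rho(s, a)
        # importance-weight the next-state term and the reward
        A += np.outer(z, phi(s) - gamma * rho_t * phi(s_next))
        b += rho_t * r * z
        rho_prev = rho_t
    # small ridge term guards against a (near-)singular A on short runs
    theta = np.linalg.solve(A + reg * np.eye(d), b)
    return theta
```

The ridge term is only a pragmatic guard against an ill-conditioned matrix on short trajectories; the boundedness issues the paper analyzes motivate its own, different modification to the practical implementation, which the abstract does not spell out.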
Type: Conference
Year: 2010
Where: ICML
Authors: Huizhen Yu