
SIAM Journal on Control and Optimization (SIAMCO)
2000

The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning

Abstract. It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergence of the algorithm. Several specific classes of algorithms are considered as applications. It is found that the results provide (i) a simpler derivation of known results for reinforcement learning algorithms; (ii) a proof, for the first time, that a class of asynchronous stochastic approximation algorithms is convergent without any a priori assumption of stability; (iii) a proof, for the first time, that asynchronous adaptive critic and Q-learning algorithms are convergent for the average cost optimal control problem.

Key words. stochastic approximation, ODE method, stability, asynchronous algorithms, reinforcement learning

AMS subject classifications. 62L20, 93E25, 93E15

PII. S0363012997331639
Vivek S. Borkar, Sean P. Meyn
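
The central idea, that the stochastic approximation iterates track the trajectories of an associated ODE whose asymptotic stability yields convergence, can be illustrated with a minimal sketch. The drift h(x) = -x, the step-size schedule, and the noise model below are illustrative choices, not taken from the paper:

import numpy as np

# Illustrative sketch (not from the paper) of the ODE method:
# the stochastic approximation recursion
#     theta_{n+1} = theta_n + a_n * (h(theta_n) + noise_n)
# tracks the ODE  d theta / dt = h(theta).  Here h(x) = -x, whose
# origin is globally asymptotically stable, so the iterates converge.

rng = np.random.default_rng(0)

def h(x):
    return -x  # drift field of the associated ODE

theta = 5.0                        # arbitrary initial condition
for n in range(1, 100_000):
    a_n = 1.0 / n                  # steps with sum a_n = inf, sum a_n^2 < inf
    noise = rng.normal(scale=1.0)  # martingale-difference noise
    theta += a_n * (h(theta) + noise)

print(f"final iterate: {theta:.4f}  (ODE equilibrium: 0)")

Because the steps are diminishing, the noise averages out and the iterate settles near the ODE's equilibrium at the origin, mirroring the stability-implies-convergence argument in the abstract.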