Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov chains. The method is based on designing sequential control variates using successive approximations of the function of interest V. Regular Monte Carlo estimates have a variance of O(1/N), where N is the number of samples. Here, we obtain a geometric variance reduction O(ρ^N) (with ρ < 1) up to a threshold that depends on the approximation error V − AV, where A is an approximation operator linear in the values. Thus, if V belongs to the right approximation space (i.e. AV = V), the variance decreases geometrically to zero. An immediate application is value function estimation in Markov chains, which may be used for policy evaluation in policy iteration for Markov Decision Processes. Another important domain, for which variance reduction is highly needed, is gradient estimation, that is, computing the sensitivity ∂_α V of the performance measure V with respect to some parameter α of the ...
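The idea of sequential control variates can be illustrated with a small sketch. It is not the paper's implementation: the 5-state ring chain, the rewards, the survival probability GAMMA, and all function names are hypothetical, the transition kernel is assumed known so that the residual r − V_n + PV_n can be computed exactly, and A is taken to be the identity (a tabular representation, so AV = V). Each iteration uses the current approximation V_n as a control variate and estimates only the remaining error V − V_n by Monte Carlo.

```python
import random

GAMMA = 0.5                           # per-step survival probability (assumed)
REWARD = [1.0, 0.0, 2.0, 0.5, 1.5]    # hypothetical rewards on a 5-state ring
N_STATES = len(REWARD)

def expected_next(values, x):
    """(P values)(x): survive with prob. GAMMA, then step left or right."""
    left = values[(x - 1) % N_STATES]
    right = values[(x + 1) % N_STATES]
    return GAMMA * 0.5 * (left + right)

def exact_value():
    """Reference solution of V = r + P V by fixed-point iteration."""
    v = [0.0] * N_STATES
    for _ in range(200):
        v = [REWARD[x] + expected_next(v, x) for x in range(N_STATES)]
    return v

def rollout_sum(g, x, rng):
    """Sum g along one trajectory started at x until the chain is killed."""
    total = 0.0
    while True:
        total += g[x]
        if rng.random() >= GAMMA:     # killed with probability 1 - GAMMA
            return total
        x = (x + rng.choice((-1, 1))) % N_STATES

def mc_estimate(g, n_traj, rng):
    """Plain Monte Carlo estimate of the value of reward g at every state."""
    return [sum(rollout_sum(g, x, rng) for _ in range(n_traj)) / n_traj
            for x in range(N_STATES)]

def sequential_control_variates(n_iter, n_traj, rng):
    """Refine V_n by estimating V - V_n, the value of the residual reward."""
    v = [0.0] * N_STATES
    for _ in range(n_iter):
        # Residual reward r - V_n + P V_n; its value function is V - V_n.
        residual = [REWARD[x] - v[x] + expected_next(v, x)
                    for x in range(N_STATES)]
        correction = mc_estimate(residual, n_traj, rng)
        v = [v[x] + correction[x] for x in range(N_STATES)]
    return v

def max_error(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

rng = random.Random(0)
v_true = exact_value()
# Same total budget per state: 1000 trajectories either way.
err_plain = max_error(mc_estimate(REWARD, 1000, rng), v_true)
v_scv = sequential_control_variates(10, 100, rng)
err_scv = max_error(v_scv, v_true)
print("plain Monte Carlo error:     ", err_plain)
print("sequential control variates: ", err_scv)
```

Because the residual reward shrinks along with the error V − V_n, the noise in each correction shrinks too, which is what drives the geometric decay in this toy setting. With a restricted approximation space (AV ≠ V) the error would instead be expected to level off at a threshold, as the abstract states.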

Rémi Munos

Approximation | Geometric Variance Reduction | JMLR 2006 | Variance Reduction

Added	13 Dec 2010
Updated	13 Dec 2010
Type	Journal
Year	2006
Where	JMLR
Authors	Rémi Munos
