ML
2002
ACM

A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes

An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or even infinite state spaces, traditional planning and reinforcement learning algorithms are often inapplicable, since their running time typically scales linearly with the state space size in the worst case. In this paper we present a new algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states. Although the running time is exponential in the horizon time (which depends only on the discount factor and the desired degree of approximation to the optimal policy), our results establish for the first time that there are no theoretical barriers to computing near-optimal policies in arbitrarily large, unstructured MDPs.
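The idea in the abstract can be illustrated with a minimal sketch of lookahead planning from a generative model. This is not the authors' code: the names `estimate_q`, `plan`, and `toy_model`, the simulator interface (state, action) -> (reward, next state), and the hand-picked `depth` and `width` values are all assumptions made for illustration. The abstract says the horizon depends only on the discount factor and the desired accuracy; here it is simply passed in as a parameter.

```python
import random
from typing import Callable, Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable
# Assumed simulator interface: (state, action) -> (reward, next_state).
GenerativeModel = Callable[[State, Action], Tuple[float, State]]


def estimate_q(model: GenerativeModel, actions: Sequence[Action], state: State,
               depth: int, width: int, gamma: float) -> Dict[Action, float]:
    """Estimate the value of each action at `state` by sampled lookahead.

    `depth` is the lookahead horizon and `width` is the number of simulator
    calls per (state, action) pair. The state space is never enumerated.
    """
    if depth == 0:
        # Beyond the horizon, future value is approximated by zero.
        return {a: 0.0 for a in actions}
    q = {}
    for a in actions:
        total = 0.0
        for _ in range(width):
            reward, next_state = model(state, a)
            # Value of the sampled successor: best estimated action there.
            future = max(estimate_q(model, actions, next_state,
                                    depth - 1, width, gamma).values())
            total += reward + gamma * future
        q[a] = total / width
    return q


def plan(model: GenerativeModel, actions: Sequence[Action], state: State,
         depth: int, width: int, gamma: float) -> Action:
    """Return the action with the highest estimated value at `state`."""
    q = estimate_q(model, actions, state, depth, width, gamma)
    return max(q, key=q.get)


_rng = random.Random(0)  # seeded so the toy example is repeatable


def toy_model(state: int, action: str) -> Tuple[float, int]:
    """Hypothetical simulator on the integers (an unbounded state space).

    Moving "right" pays reward 1, moving "left" pays 0; the landing position
    is perturbed by random noise.
    """
    step = 1 if action == "right" else -1
    reward = 1.0 if action == "right" else 0.0
    return reward, state + step + _rng.choice([-1, 0, 1])


if __name__ == "__main__":
    print(plan(toy_model, ["left", "right"], 0, depth=3, width=2, gamma=0.9))
```

In this sketch the number of simulator calls grows as (number of actions × `width`) raised to the power `depth`, so the cost is exponential in the horizon but does not depend on how many states the MDP has, which matches the trade-off the abstract describes.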
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng
Added 22 Dec 2010
Updated 22 Dec 2010
Type Journal
Year 2002
Where ML
Authors Michael J. Kearns, Yishay Mansour, Andrew Y. Ng