Generalization in Reinforcement Learning: Safely Approximating the Value Function

To appear in: G. Tesauro, D. S. Touretzky, and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, MIT Press, Cambridge, MA, 1995. A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic programming is to replace the lookup table with a generalizing function approximator such as a neural net. Although this has been successful in the domain of backgammon, there is no guarantee of convergence. In this paper, we show that the combination of dynamic programming and function approximation is not robust, and even in very benign cases may produce an entirely wrong policy. We then introduce Grow-Support, a new algorithm which is safe from divergence yet can still reap the benefits of successful generalization.
Justin A. Boyan, Andrew W. Moore
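
As a concrete illustration of the combination the abstract describes, the sketch below implements fitted value iteration: each sweep performs one-step dynamic-programming backups through the current approximation, then refits a function approximator to the backed-up targets by least squares. The 1-D chain MDP, linear features, and constants here are illustrative assumptions, not the paper's experimental setup.

import numpy as np

# Illustrative 1-D chain MDP: states 0..N-1, moves left or right at a
# reward of -1 per step, state N-1 is an absorbing goal. (All choices
# here are assumptions for the sketch, not the paper's domains.)
N = 20
GAMMA = 0.9
states = np.arange(N)
# Linear features: a bias term plus the normalized state index.
Phi = np.stack([np.ones(N), states / (N - 1)], axis=1)

w = np.zeros(2)  # weights of the linear value-function approximator
for sweep in range(100):
    V = Phi @ w  # current approximate values at every state
    targets = np.empty(N)
    for s in range(N):
        if s == N - 1:
            targets[s] = 0.0  # absorbing goal: zero future reward
            continue
        successors = (max(s - 1, 0), min(s + 1, N - 1))
        # One-step dynamic-programming backup through the approximator.
        targets[s] = max(-1.0 + GAMMA * V[sp] for sp in successors)
    # Refit the approximator to the backed-up targets (least squares).
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

print("fitted values:", np.round(Phi @ w, 2))

Even on this benign chain the exact value function (a truncated discounted sum, curving with distance to the goal) is not representable by the linear features, so every refit introduces approximation error; the paper's point is that such errors can feed back through the backups and yield an entirely wrong policy rather than merely an imprecise one.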
Type: Conference
Year: 1994
Where: NIPS
Authors: Justin A. Boyan, Andrew W. Moore