Automatic basis function construction for approximate dynamic programming and reinforcement learning

We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results by Bertsekas and Castañon (1989), who proposed a method for automatically aggregating states to speed up value iteration. We propose to use neighborhood components analysis (Goldberger et al., 2005), a dimensionality reduction technique created for supervised learning, to map a high-dimensional state space to a low-dimensional space, based on the Bellman error or on the temporal difference (TD) error. We then place basis functions in the lower-dimensional space; these are added as new features for the linear function approximator. The approach is applied to a high-dimensional inventory control problem.
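A minimal sketch of the pipeline the abstract describes, on toy data. All specifics here are assumptions for illustration: the sampled transitions, the discount factor, the number of RBF centers, and the bandwidth are invented, and scikit-learn's NeighborhoodComponentsAnalysis is classification-only, so the TD errors are binned into quantile classes as a rough stand-in for the paper's error-driven use of NCA.

```python
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis

rng = np.random.default_rng(0)
gamma = 0.95                                  # discount factor (assumed)

# --- Toy sampled transitions (s, r, s') from a high-dimensional MDP ---
n, d = 500, 20                                # 500 samples, 20-dim states
S = rng.normal(size=(n, d))                   # states
S_next = S + 0.1 * rng.normal(size=(n, d))    # successor states
R = S[:, 0] + rng.normal(scale=0.1, size=n)   # rewards (toy)

# --- TD errors of the current linear value estimate ---
w = rng.normal(size=d)                        # current approximator weights
td_error = R + gamma * (S_next @ w) - (S @ w)

# --- NCA projection driven by the TD error ---
# Bin the errors into quantile classes so the supervised,
# classification-style NCA can group states with similar errors.
labels = np.digitize(td_error, np.quantile(td_error, [0.25, 0.5, 0.75]))
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0)
Z = nca.fit_transform(S, labels)              # low-dimensional embedding

# --- Place RBF basis functions in the low-dimensional space ---
k, bandwidth = 10, 1.0                        # centers / width (assumed)
centers = Z[rng.choice(n, size=k, replace=False)]
sq_dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
rbf_features = np.exp(-sq_dists / (2 * bandwidth ** 2))

# --- Append the constructed features to the original state features ---
Phi = np.hstack([S, rbf_features])
print(Phi.shape)                              # (500, 30)
```

Sampling RBF centers from the embedded states is one simple placement heuristic; the enlarged feature matrix Phi can then be handed to any linear-in-features value estimator (e.g., LSTD or TD(0) with linear function approximation).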
Philipp W. Keller, Shie Mannor, Doina Precup
Type: Conference
Year: 2006
Where: ICML
Authors: Philipp W. Keller, Shie Mannor, Doina Precup