Adaptive bases for Q-learning

Abstract: We consider reinforcement learning, and in particular the Q-learning algorithm, in large state and action spaces. To cope with the size of these spaces, the state-action value function must be represented by a function approximator. We generalize the classical Q-learning algorithm to one in which the basis of the linear function approximation changes dynamically while the agent interacts with the environment. The motivation for this approach is to fit the state-action value function more closely to the problem at hand, and thus obtain better performance. The algorithm is shown to converge using two-time-scale stochastic approximation. Finally, we discuss how this technique can be applied to a rich family of RL algorithms with linear function approximation.
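The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a toy deterministic chain environment, Gaussian radial basis functions whose centers are the adaptable basis parameters, and a semi-gradient step on the squared temporal-difference error for those centers. The names `features`, `step_sizes`, `env_step`, and `train`, the constants, and the step-size schedules are all hypothetical choices made for illustration. The weights are updated on the fast time scale and the centers on the slow one, with the ratio of slow to fast step sizes tending to zero.

```python
import math
import random

N_STATES = 11      # toy chain with states 0..10 (assumption, not from the paper)
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor
SIGMA = 0.25       # fixed RBF width; only the centers are adapted here


def features(x, centers):
    """Gaussian RBF features of a normalized state x in [0, 1]."""
    return [math.exp(-((x - c) ** 2) / (2 * SIGMA ** 2)) for c in centers]


def q_value(w_a, phi):
    """Linear approximation: Q(s, a) = w_a . phi(s)."""
    return sum(wi * pi for wi, pi in zip(w_a, phi))


def step_sizes(t):
    """Fast (weights) and slow (basis) step sizes; beta / alpha -> 0 as t grows."""
    alpha = 0.1 / (1 + t) ** 0.6
    beta = 0.01 / (1 + t)
    return alpha, beta


def env_step(s, a):
    """Deterministic chain: reward 1 on reaching the right end, which ends the episode."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    if s2 == N_STATES - 1:
        return s2, 1.0, True
    return s2, 0.0, False


def train(steps=2000, n_basis=4, eps=0.2, seed=0):
    rng = random.Random(seed)
    centers = [(k + 0.5) / n_basis for k in range(n_basis)]  # initial basis
    w = [[0.0] * n_basis for _ in ACTIONS]                   # one weight vector per action
    s = N_STATES // 2
    for t in range(steps):
        x = s / (N_STATES - 1)
        phi = features(x, centers)
        # Epsilon-greedy action selection with random tie-breaking.
        qs = [q_value(w[b], phi) for b in ACTIONS]
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            best = max(qs)
            a = rng.choice([b for b in ACTIONS if qs[b] == best])
        s2, r, done = env_step(s, a)
        phi2 = features(s2 / (N_STATES - 1), centers)
        target = r if done else r + GAMMA * max(q_value(w[b], phi2) for b in ACTIONS)
        delta = target - qs[a]  # temporal-difference error
        alpha, beta = step_sizes(t)
        # dQ/dc_k for the chosen action, computed with the pre-update weights.
        grads = [w[a][k] * phi[k] * (x - centers[k]) / SIGMA ** 2 for k in range(n_basis)]
        # Fast time scale: standard Q-learning update of the linear weights.
        for k in range(n_basis):
            w[a][k] += alpha * delta * phi[k]
        # Slow time scale: semi-gradient step on the centers, clipped to [0, 1].
        for k in range(n_basis):
            centers[k] = min(1.0, max(0.0, centers[k] + beta * delta * grads[k]))
        s = N_STATES // 2 if done else s2
    return w, centers
```

Calling `train()` returns the learned weights and the adapted centers. The polynomially decaying schedules in `step_sizes` are one common way to separate the two time scales; the conditions actually required for convergence are those stated in the paper, which this sketch does not attempt to reproduce.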
Dotan Di Castro, Shie Mannor
Added 16 May 2011
Updated 16 May 2011
Type Journal
Year 2010
Where CDC
Authors Dotan Di Castro, Shie Mannor