Sciweavers

ESANN
2006
13 years 6 months ago
Reducing policy degradation in neuro-dynamic programming
We focus on neuro-dynamic programming methods to learn state-action value functions and outline some of the inherent problems to be faced, when performing reinforcement learning in...
Thomas Gabel, Martin Riedmiller