ESANN
2006

Reducing policy degradation in neuro-dynamic programming

Download ml.informatik.uni-freiburg.de

We focus on neuro-dynamic programming methods for learning state-action value functions and outline some of the inherent problems that arise when reinforcement learning is combined with function approximation. In an attempt to overcome some of these problems, we develop a reinforcement learning method that monitors the learning process, lets the learner judge whether it would be better to cease learning, and thereby obtains more stable learning results.
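
The idea of monitoring learning and ceasing when further updates stop helping can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic "keep the best policy, stop after a run of non-improving evaluations" loop. All names here (`LearningMonitor`, `train_with_monitor`, `patience`) and the toy score sequence are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class LearningMonitor:
    """Tracks evaluated policy quality and remembers the best snapshot seen."""
    patience: int = 3                 # hypothetical: evaluations tolerated without improvement
    best_score: float = float("-inf")
    best_snapshot: Any = None
    stale: int = 0                    # consecutive evaluations without a new best

    def update(self, score: float, snapshot: Any) -> bool:
        """Record one evaluation; return True if learning should cease."""
        if score > self.best_score:
            self.best_score = score
            self.best_snapshot = snapshot
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def train_with_monitor(
    step: Callable[[int], None],      # performs one learning iteration
    evaluate: Callable[[], float],    # measures current policy quality
    snapshot: Callable[[], Any],      # captures the current policy
    max_iters: int,
    monitor: LearningMonitor,
) -> Tuple[Any, float]:
    """Run learning until the monitor says stop; return the best policy seen."""
    for it in range(max_iters):
        step(it)
        if monitor.update(evaluate(), snapshot()):
            break
    return monitor.best_snapshot, monitor.best_score


# Toy stand-in for a learner whose policy quality degrades after iteration 2.
scores = [0.2, 0.5, 0.7, 0.6, 0.4, 0.3, 0.9]
state = {"it": -1}


def step(it: int) -> None:
    state["it"] = it


def evaluate() -> float:
    return scores[state["it"]]


def snapshot() -> int:
    return state["it"]


best_snapshot, best_score = train_with_monitor(
    step, evaluate, snapshot, len(scores), LearningMonitor(patience=3)
)
```

In this toy run the loop stops after three consecutive evaluations fail to beat the best score, and the policy from the best iteration is returned rather than the degraded final one. A fixed patience threshold is only one possible stopping criterion; it also illustrates the trade-off that stopping early can miss a later improvement.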

Thomas Gabel, Martin Riedmiller

ESANN 2006 | ESANN 2007 | Reinforcement Learning | Reinforcement Learning Method | State-action Value Functions |

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2006
Where	ESANN
Authors	Thomas Gabel, Martin Riedmiller