
ECML
2004
Springer


Filtered Reinforcement Learning


Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of two ways: uniformly, or using a discounting model that assigns exponentially more credit to recent actions. This paper demonstrates an alternative approach to temporal credit assignment, taking advantage of exact or approximate prior information about correct credit assignment. Infinite impulse response (IIR) filters are used to model credit assignment information. IIR filters generalise exponentially discounting eligibility traces to arbitrary credit assignment models. This approach can be applied to any RL algorithm that employs an eligibility trace. The use of IIR credit assignment filters is explored using both the GPOMDP policy-gradient algorithm and the Sarsa(λ) temporal-difference algorithm. A drop in bias and variance of value or gradient estimates is demonstrated, resulting in faster convergence...
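
The abstract does not give the filter equations, so the following is only a minimal sketch of the general idea, assuming the standard direct-form IIR difference equation z_t = sum_k b[k] x_(t-k) - sum_(k>=1) a[k] z_(t-k) with a[0] = 1. The class name `IIRTrace` and the coefficient lists `b` and `a` are illustrative and not taken from the paper. Under this assumption, the familiar exponentially discounted trace z_t = beta * z_(t-1) + x_t is the special case b = [1], a = [1, -beta], while other coefficients encode other credit assignment models, such as a fixed delay between action and reward.

```python
from collections import deque


class IIRTrace:
    """Scalar eligibility trace driven by an IIR filter (illustrative sketch).

    Implements z_t = sum_k b[k] * x_{t-k} - sum_{k>=1} a[k] * z_{t-k},
    with the usual normalisation a[0] == 1. Names are hypothetical.
    """

    def __init__(self, b, a):
        if not a or a[0] != 1.0:
            raise ValueError("a[0] must be 1.0")
        self.b = list(b)
        self.a = list(a)
        # Most recent value is stored at index 0 of each history buffer.
        self.x_hist = deque([0.0] * len(self.b), maxlen=len(self.b))
        n_fb = max(len(self.a) - 1, 1)
        self.z_hist = deque([0.0] * n_fb, maxlen=n_fb)

    def update(self, x):
        """Feed one input sample x_t and return the new trace value z_t."""
        self.x_hist.appendleft(x)
        # Feed-forward part: weighted sum of current and past inputs.
        z = sum(bk * xk for bk, xk in zip(self.b, self.x_hist))
        # Feedback part: weighted sum of past trace values.
        z -= sum(ak * zk for ak, zk in zip(self.a[1:], self.z_hist))
        self.z_hist.appendleft(z)
        return z


def exponential_trace(xs, beta):
    """Reference implementation of the discounted trace z_t = beta*z_{t-1} + x_t."""
    z, out = 0.0, []
    for x in xs:
        z = beta * z + x
        out.append(z)
    return out


xs = [1.0, 0.0, 0.0, 1.0]
beta = 0.5

# Exponential discounting expressed as an IIR filter: b = [1], a = [1, -beta].
discounted = IIRTrace(b=[1.0], a=[1.0, -beta])
iir_out = [discounted.update(x) for x in xs]

# A different credit model: all credit goes to the input seen two steps ago.
# (This one has no feedback terms, so it is the FIR special case.)
delayed = IIRTrace(b=[0.0, 0.0, 1.0], a=[1.0])
delayed_out = [delayed.update(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

In an algorithm that keeps one trace per parameter or per state-action pair, the same update would be applied element-wise; the scalar form is used here only to keep the sketch short.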

Douglas Aberdeen


Credit Assignment | ECML 2004 | IIR Credit Assignment | Temporal Credit Assignment


Related Content

» Bayesian Reward Filtering

» Tracking value function dynamics to improve reinforcement learning with piecewise linear f...

» Learning defect classifiers for visual inspection images by neuroevolution using weakly la...

» Smoothed Sarsa Reinforcement learning for robot delivery tasks

» Peer-to-Peer Valuation as a Mechanism for Reinforcing Active Learning in Virtual Communities...

» Video Annotation Through Search and Graph Reinforcement Mining

» Efficient Reinforcement Learning Using Recursive Least-Squares Methods

» TeXDYNA Hierarchical Reinforcement Learning in Factored MDPs

» Q-Concept-Learning: Generalization with Concept Lattice Representation in Reinforcement Learn...

Post Info

Added	01 Jul 2010
Updated	01 Jul 2010
Type	Conference
Year	2004
Where	ECML
Authors	Douglas Aberdeen
