Search Sciweavers | Sciweavers

162 search results - page 7 / 33

» Off-Policy Temporal Difference Learning with Function Approx...


ICML
1995
IEEE

155 views | Machine Learning

Stable Function Approximation in Dynamic Programming

16 years 12 days ago

Download www.ri.cmu.edu

The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experime...

Geoffrey J. Gordon


NIPS
2008

165 views | Information Technology

Regularized Policy Iteration

15 years 1 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...


CORR
2010
Springer

152 views | Education

Neuroevolutionary optimization

14 years 11 months ago

Download jmlr.csail.mit.edu

Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...

Eva Volná


AAAI
2011

144 views | Intelligent Agents

Differential Eligibility Vectors for Advantage Updating and Gradient Methods

13 years 11 months ago

Download gaips.inesc-id.pt

In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...

Francisco S. Melo


ICML
2007
IEEE

180 views | Machine Learning

Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation

16 years 12 days ago

Download www.machinelearning.org

Reinforcement learning algorithms can become unstable when combined with linear function approximation. Algorithms that minimize the mean-square Bellman error are guaranteed to co...

Chee Wee Phua, Robert Fitch
