Search Sciweavers | Sciweavers

102 search results - page 2 / 21

» Efficient Asymptotic Approximation in Temporal Difference Le...

ICML
2009
IEEE

186 views, Machine Learning

Regularization and feature selection in least-squares temporal difference learning

14 years 6 months ago

Download ai.stanford.edu

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng

ICML
2001
IEEE

185 views, Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

14 years 6 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

ICML
2008
IEEE

165 views, Machine Learning

A worst-case comparison between temporal difference and residual gradient with linear function approximation

14 years 6 months ago

Download www.research.rutgers.edu

Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...

Lihong Li

ML
2002
ACM

154 views, Machine Learning

Technical Update: Least-Squares Temporal Difference Learning

13 years 4 months ago

Download www.research.rutgers.edu

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...

Justin A. Boyan

ICML
2003
IEEE

150 views, Machine Learning

The Significance of Temporal-Difference Learning in Self-Play Training: TD-Rummy versus EVO-rummy

13 years 10 months ago

Download www.hpl.hp.com

Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...

Clifford Kotnik, Jugal K. Kalita
