Search Sciweavers | Sciweavers

121 search results - page 2 / 25

» Investigating practical, linear temporal difference learning

ML
2002
ACM

Technical Update: Least-Squares Temporal Difference Learning

13 years 4 months ago

Download www.research.rutgers.edu

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...

Justin A. Boyan

ML
2002
ACM

On Average Versus Discounted Reward Temporal-Difference Learning

13 years 4 months ago

Download web.mit.edu

We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...

John N. Tsitsiklis, Benjamin Van Roy

ICML
2009
IEEE

Regularization and feature selection in least-squares temporal difference learning

14 years 5 months ago

Download ai.stanford.edu

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng

ML
2000
ACM

Learning to Play Chess Using Temporal Differences

13 years 4 months ago

Download www.cs.princeton.edu

In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...

Jonathan Baxter, Andrew Tridgell, Lex Weaver

ICML
2010
IEEE

Convergence of Least Squares Temporal Difference Methods Under General Conditions

13 years 6 months ago

Download www.cs.helsinki.fi

We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...

Huizhen Yu
