Sciweavers

114 search results - page 1 / 23
Search: Temporal Difference Updating without a Learning Rate

NIPS 2007
Temporal Difference Updating without a Learning Rate
Marcus Hutter, Shane Legg

ISDA 2009, IEEE
Postponed Updates for Temporal-Difference Reinforcement Learning
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...
Harm van Seijen, Shimon Whiteson

COLT 2000, Springer
Bias-Variance Error Bounds for Temporal Difference Updates
We give the first rigorous upper bounds on the error of temporal difference (TD) algorithms for policy evaluation as a function of the amount of experience. These upper bounds pr...
Michael J. Kearns, Satinder P. Singh

ML 2002, ACM
Technical Update: Least-Squares Temporal Difference Learning
TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
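
The snippet above describes how TD(λ) operates; as a concrete illustration (a sketch, not taken from Boyan's paper), here is a minimal tabular TD(λ) policy-evaluation loop with accumulating eligibility traces. The `env` object, its `reset()`/`step()` interface, and the `policy` function are assumptions made for the sketch.

```python
import numpy as np

def td_lambda(env, policy, num_states, episodes=100,
              alpha=0.1, gamma=0.99, lam=0.9):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    Assumes a hypothetical `env` whose reset() returns a state index
    and whose step(action) returns (next_state, reward, done).
    """
    V = np.zeros(num_states)           # current value estimates
    for _ in range(episodes):
        z = np.zeros(num_states)       # eligibility traces, reset per episode
        s = env.reset()
        done = False
        while not done:
            s_next, r, done = env.step(policy(s))
            # Bootstrapped TD target: observed reward plus discounted
            # current estimate of the next state's value.
            target = r if done else r + gamma * V[s_next]
            delta = target - V[s]      # TD error for this transition
            z[s] += 1.0                # accumulate trace for the visited state
            V += alpha * delta * z     # update all states in proportion to traces
            z *= gamma * lam           # decay every trace
            s = s_next
    return V
```

Because every state's estimate moves in proportion to its trace, credit from each transition is spread backward incrementally, which is the per-transition updating the snippet refers to; least-squares TD, by contrast, accumulates statistics over transitions and solves for the value function directly.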

ICML 2010, IEEE
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
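
For context on the "bootstrapped" return estimates and the role of λ mentioned in the snippet, the following sketch (an illustration under assumptions, not the method of Downey and Sanner) computes the λ-return: a geometrically weighted average of n-step returns, each of which bootstraps off the current value estimates. The `rewards` and `values` arrays are assumed inputs from a single finished episode, with `values[k]` the current estimate of the k-th visited state and the terminal value taken as zero.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """n-step return from time t: real rewards for n steps, then
    bootstrap with the current value estimate of the state reached."""
    T = len(rewards)
    n = min(n, T - t)                          # truncate at episode end
    g = sum(gamma**k * rewards[t + k] for k in range(n))
    if t + n < len(values):                    # bootstrap if not terminal
        g += gamma**n * values[t + n]
    return g

def lambda_return(rewards, values, t, lam=0.9, gamma=0.99):
    """lambda-return: geometric mixture of n-step returns,
    G_t^lambda = (1 - lam) * sum_n lam^(n-1) * G_t^(n),
    with the final (full) return absorbing the remaining weight."""
    T = len(rewards)
    g, weight = 0.0, (1.0 - lam)
    for n in range(1, T - t):
        g += weight * n_step_return(rewards, values, t, n, gamma)
        weight *= lam
    # remaining probability mass goes to the full Monte Carlo return
    g += lam**(T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return g
```

With lam=0 this reduces to the one-step bootstrapped target r_t + γV(s_{t+1}), while lam=1 recovers the Monte Carlo return, so choosing or adapting λ trades off between these two estimators.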