Sciweavers

114 search results - page 1 / 23
Search: Temporal Difference Updating without a Learning Rate

NIPS 2007
Temporal Difference Updating without a Learning Rate
Marcus Hutter, Shane Legg

ISDA 2009, IEEE
Postponed Updates for Temporal-Difference Reinforcement Learning
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...
Harm van Seijen, Shimon Whiteson

COLT 2000, Springer
Bias-Variance Error Bounds for Temporal Difference Updates
We give the first rigorous upper bounds on the error of temporal difference (TD) algorithms for policy evaluation as a function of the amount of experience. These upper bounds pr...
Michael J. Kearns, Satinder P. Singh

ML 2002, ACM
Technical Update: Least-Squares Temporal Difference Learning
TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
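
The snippet above describes how TD(λ) operates; as a concrete illustration (a sketch, not taken from Boyan's paper), here is a minimal tabular TD(λ) policy-evaluation loop with accumulating eligibility traces. The `env` object, its `reset()`/`step()` interface, and the `policy` function are assumptions made for the sketch.

```python
import numpy as np

def td_lambda(env, policy, num_states, episodes=100,
              alpha=0.1, gamma=0.99, lam=0.9):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    Assumes a hypothetical `env` whose reset() returns a state index
    and whose step(action) returns (next_state, reward, done).
    """
    V = np.zeros(num_states)           # current value estimates
    for _ in range(episodes):
        z = np.zeros(num_states)       # eligibility traces, reset per episode
        s = env.reset()
        done = False
        while not done:
            s_next, r, done = env.step(policy(s))
            # Bootstrapped TD target: observed reward plus discounted
            # current estimate of the next state's value.
            target = r if done else r + gamma * V[s_next]
            delta = target - V[s]      # TD error for this transition
            z[s] += 1.0                # accumulate trace for the visited state
            V += alpha * delta * z     # update all states in proportion to traces
            z *= gamma * lam           # decay every trace
            s = s_next
    return V
```

Because every state's estimate moves in proportion to its trace, credit from each transition is spread backward incrementally, which is the per-transition updating the snippet refers to; least-squares TD, by contrast, accumulates statistics over transitions and solves for the value function directly.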

ICML 2010, IEEE
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
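
For context on the "bootstrapped" return estimates and the role of λ mentioned in the snippet, the following sketch (an illustration under assumptions, not the method of Downey and Sanner) computes the λ-return: a geometrically weighted average of n-step returns, each of which bootstraps off the current value estimates. The `rewards` and `values` arrays are assumed inputs from a single finished episode, with `values[k]` the current estimate of the k-th visited state and the terminal value taken as zero.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """n-step return from time t: real rewards for n steps, then
    bootstrap with the current value estimate of the state reached."""
    T = len(rewards)
    n = min(n, T - t)                          # truncate at episode end
    g = sum(gamma**k * rewards[t + k] for k in range(n))
    if t + n < len(values):                    # bootstrap if not terminal
        g += gamma**n * values[t + n]
    return g

def lambda_return(rewards, values, t, lam=0.9, gamma=0.99):
    """lambda-return: geometric mixture of n-step returns,
    G_t^lambda = (1 - lam) * sum_n lam^(n-1) * G_t^(n),
    with the final (full) return absorbing the remaining weight."""
    T = len(rewards)
    g, weight = 0.0, (1.0 - lam)
    for n in range(1, T - t):
        g += weight * n_step_return(rewards, values, t, n, gamma)
        weight *= lam
    # remaining probability mass goes to the full Monte Carlo return
    g += lam**(T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return g
```

With lam=0 this reduces to the one-step bootstrapped target r_t + γV(s_{t+1}), while lam=1 recovers the Monte Carlo return, so choosing or adapting λ trades off between these two estimators.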