Sciweavers

114 search results - page 2 / 23
» Temporal Difference Updating without a Learning Rate
Sort
View
AAAI
2011
13 years 9 months ago
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo
ML
2000
ACM
126views Machine Learning» more  ML 2000»
14 years 9 months ago
Learning to Play Chess Using Temporal Differences
In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Jonathan Baxter, Andrew Tridgell, Lex Weaver
ICML
2006
IEEE
15 years 10 months ago
Relational temporal difference learning
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Nima Asgharbeygi, David J. Stracuzzi, Pat Langley
ICML
1999
IEEE
15 years 10 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
NIPS
2008
14 years 11 months ago
Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation
Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...
Dotan Di Castro, Dmitry Volkinshtein, Ron Meir