Sciweavers

114 search results - page 2 / 23
» Temporal Difference Updating without a Learning Rate
Sort
View
AAAI
2011
12 years 5 months ago
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo
ML
2000
ACM
126views Machine Learning» more  ML 2000»
13 years 5 months ago
Learning to Play Chess Using Temporal Differences
In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Jonathan Baxter, Andrew Tridgell, Lex Weaver
ICML
2006
IEEE
14 years 6 months ago
Relational temporal difference learning
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Nima Asgharbeygi, David J. Stracuzzi, Pat Langley
ICML
1999
IEEE
14 years 6 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
NIPS
2008
13 years 6 months ago
Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation
Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...
Dotan Di Castro, Dmitry Volkinshtein, Ron Meir