Sciweavers

162 search results - page 4 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
Sort
View
ECAI
2000
Springer
13 years 9 months ago
Efficient Asymptotic Approximation in Temporal Difference Learning
Abstract. TD(
Frédérick Garcia, Florent Serre
ML
2000
ACM
126views Machine Learning» more  ML 2000»
13 years 5 months ago
Learning to Play Chess Using Temporal Differences
In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Jonathan Baxter, Andrew Tridgell, Lex Weaver
ICML
2006
IEEE
13 years 12 months ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
JMLR
2010
119views more  JMLR 2010»
13 years 23 days ago
A Convergent Online Single Time Scale Actor Critic Algorithm
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
Dotan Di Castro, Ron Meir