Sciweavers

162 search results - page 4 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
Sort
View
ECAI
2000
Springer
15 years 1 months ago
Efficient Asymptotic Approximation in Temporal Difference Learning
Abstract. TD(
Frédérick Garcia, Florent Serre
ML
2000
ACM
126views Machine Learning» more  ML 2000»
14 years 9 months ago
Learning to Play Chess Using Temporal Differences
In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Jonathan Baxter, Andrew Tridgell, Lex Weaver
103
Voted
ICML
2006
IEEE
15 years 3 months ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
JMLR
2010
119views more  JMLR 2010»
14 years 4 months ago
A Convergent Online Single Time Scale Actor Critic Algorithm
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
Dotan Di Castro, Ron Meir