Sciweavers

162 search results - page 11 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
UAI
2008
14 years 11 months ago
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
Richard S. Sutton, Csaba Szepesvári, Alborz...
ESANN
2001
14 years 10 months ago
Learning fault-tolerance in Radial Basis Function Networks
This paper describes a method of supervised learning based on forward selection branching. This method improves fault tolerance by means of combining information related to general...
Xavier Parra, Andreu Català
IJON
2006
14 years 9 months ago
Reinforcement learning of a simple control task using the spike response model
In this work, we propose a variation of a direct reinforcement learning algorithm, suitable for usage with spiking neurons based on the spike response model (SRM). The SRM is a bi...
Murilo Saraiva de Queiroz, Roberto Coelho de Berr...
ICONIP
2009
14 years 7 months ago
Tracking in Reinforcement Learning
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
FLAIRS
2003
14 years 10 months ago
Learning Opening Strategy in the Game of Go
In this paper, we present an experimental methodology and results for a machine learning approach to learning opening strategy in the game of Go, a game for which the best compute...
Timothy Huang, Graeme Connell, Bryan McQuade