We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
This paper describes a method of supervised learning based on forward selection branching. This method improves fault tolerance by combining information related to general...
In this work, we propose a variation of a direct reinforcement learning algorithm suitable for use with spiking neurons based on the spike response model (SRM). The SRM is a bi...
Murilo Saraiva de Queiroz, Roberto Coelho de Berr&...
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
In this paper, we present an experimental methodology and results for a machine learning approach to learning opening strategy in the game of Go, a game for which the best compute...