In this paper we present an approach for reducing the memory footprint requirement of temporal difference methods in which the set of states is finite. We use case-based generaliza...
Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. Both genetic algorithms (GAs) and te...
The game of Go has a high branching factor that defeats the tree search approach used in computer chess, and long-range spatiotemporal interactions that make position evaluation e...
Nicol N. Schraudolph, Peter Dayan, Terrence J. Sej...
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...