Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Distributed power control is an important issue in wireless networks. Recently, noncooperative game theory has been applied to investigate interesting solutions to this problem. Th...
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
— This paper presents an algorithm for adapting periodic behavior to gradual shifts in task parameters. Since learning optimal control in high dimensional domains is subject to t...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...