In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
In many practical reinforcement learning problems, the state space is too large to permit an exact representation of the value function, much less the time required to compute it. ...
This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...
Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...
We present a reinforcement learning game player that can interact with a General Game Playing system and transfer knowledge learned in one game to expedite learning in many other ...