In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
XCS is a learning classifier system that combines a reinforcement learning scheme with evolutionary algorithms to evolve rule sets on-line by means of the interaction with an envi...
Sergio Morales-Ortigosa, Albert Orriols-Puig, Este...
Dynamic scripting is a reinforcement learning algorithm designed specifically to learn appropriate tactics for an agent in a modern computer game, such as Neverwinter Nights. This...
This chapter presents a generic internal reward system that drives an agent to increase the complexity of its behavior. This reward system does not reinforce a predefined task. It...
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is relation among those tasks, then the information gained duri...