In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
XCS is a learning classifier system that combines a reinforcement learning scheme with evolutionary algorithms to evolve rule sets on-line by means of the interaction with an envi...
Sergio Morales-Ortigosa, Albert Orriols-Puig, Este...
Dynamic scripting is a reinforcement learning algorithm designed specifically to learn appropriate tactics for an agent in a modern computer game, such as Neverwinter Nights. This...
This chapter presents a generic internal reward system that drives an agent to increase the complexity of its behavior. This reward system does not reinforce a predefined task. It...
— In this paper, we present a novel approach to controlling a robotic system online from scratch based on the reinforcement learning principle. In contrast to other approaches, o...