Abstract. Both self-learning architecture (embedded structure) and explicit/implicit teaching from other agents (environmental design issue) are necessary not only for one behavior...
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-gammon illustrated how the combination of game tree search...
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
This paper describes our study into the concept of using rewards in a classifier system applied to the acquisition of decision-making algorithms for agents in a soccer game. Our a...
Learning in real-world domains often requires to deal with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithm...
Alessandro Lazaric, Marcello Restelli, Andrea Bona...