In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
The aim of the Cyber Rodent project [1] is to elucidate the origin of our reward and affective systems by building artificial agents that share the natural biological constraints...
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
To gain insights into the neural basis of such adaptive decision-making processes, we investigated the nature of learning process in humans playing a competitive game with binary ...