In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
In this paper, we address the task of mapping high-level instructions to sequences of commands in an external environment. Processing these instructions is challenging--they posit...
S. R. K. Branavan, Luke S. Zettlemoyer, Regina Bar...
This paper describes a novel method by which a dialogue agent can learn to choose an optimal dialogue strategy. While it is widely agreed that dialogue strategies should be formul...
Marilyn A. Walker, Jeanne Frommer, Shrikanth Naray...
We investigate the possibility to apply a known machine learning algorithm of Q-learning in the domain of a Virtual Learning Environment (VLE). It is important in this problem doma...
In this paper, we study the relationship between the two techniques known as ant colony optimization (aco) and stochastic gradient descent. More precisely, we show that some empir...