The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...
There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order, given feedback in the form of ...
William W. Cohen, Robert E. Schapire, Yoram Singer
We consider the problem of discovering a smooth unknown surface S bounding an object O in R^3. The discovery process consists of moving a point probing device in the free space ar...
Jean-Daniel Boissonnat, Leonidas J. Guibas, Steve ...
Recent work in transfer learning has succeeded in making reinforcement learning algorithms more efficient by incorporating knowledge from previous tasks. However, such methods typ...
We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...