Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of...
Michael Collins, Amir Globerson, Terry Koo, Xavier...
Natural policy gradient methods and the covariance matrix adaptation evolution strategy, two variable metric methods proposed for solving reinforcement learning tasks, are contrast...
— The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
For this special session of EU projects in the area of NeuroIT, we will review the progress of the MirrorBot project with special emphasis on its relation to reinforcement learning...
Cornelius Weber, David Muse, Mark Elshaw, Stefan W...