Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
We use graphical models and structure learning to explore how people learn policies in sequential decision making tasks. Studies of sequential decision-making in humans frequently...
Solving in an efficient manner many different optimal control tasks within the same underlying environment requires decomposing the environment into its computationally elemental ...
To meet time constraints, a CBR system must control the time spent searching in the case base for a solution. In this paper, we presents the results of a case study comparing the p...
Locally weighted as well as Gaussian mixtures learning algorithms are suitable strategies for trajectory learning and skill acquisition, in the context of programming by demonstrat...