We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
An evolutionary reinforcement-learning algorithm, the operation of which was not associated with an optimality condition, was instantiated in an artificial organism. The algorithm ...
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...
Real-world behavior-based robot control problems require the coordination of a large number of competing behaviors. However, coordination becomes increasingly difficult as the num...
Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...