Sciweavers

91 search results - page 1 / 19
» Parameter-exploring policy gradients
ICANN
2010
Springer
13 years 5 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, Jü...
NECO
2010
13 years 3 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
AAAI
2011
12 years 4 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole
NN
2010
Springer
13 years 3 months ago
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...
ICMLA
2010
13 years 2 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...