Search Sciweavers | Sciweavers

16

ICANN
2010
Springer

164views Neural Networks» more ICANN 2010»

Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients

13 years 5 months ago

Abstract. Developing superior artificial board-game players is a widelystudied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...

Mandy Grüttner, Frank Sehnke, Tom Schaul, J&u...

claim paper

Read More »

15

click to vote

NECO
2010

97views more NECO 2010»

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

13 years 3 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

claim paper

Read More »

24

click to vote

AAAI
2011

145views Intelligent Agents» more AAAI 2011»

Policy Gradient Planning for Environmental Decision Making with Existing Simulators

12 years 4 months ago

Download www.cs.ubc.ca

In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...

Mark Crowley, David Poole

claim paper

Read More »

17

click to vote

NN
2010
Springer

125views Neural Networks» more NN 2010»

Parameter-exploring policy gradients

13 years 3 months ago

Download www.kyb.mpg.de

We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...

Frank Sehnke, Christian Osendorfer, Thomas Rü...

claim paper

Read More »

24

click to vote

ICMLA
2010

203views Machine Learning» more ICMLA 2010»

Multimodal Parameter-exploring Policy Gradients

13 years 2 months ago

Download www6.in.tum.de

Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers