Search Sciweavers | Sciweavers

49 search results - page 3 / 10

» Model-based Policy Gradient Reinforcement Learning

click to vote

ICMLA
2010

203views Machine Learning» more ICMLA 2010»

Multimodal Parameter-exploring Policy Gradients

13 years 3 months ago

Download www6.in.tum.de

Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

claim paper

Read More »

click to vote

ECML
2005
Springer

193views Machine Learning» more ECML 2005»

Natural Actor-Critic

13 years 11 months ago

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal

claim paper

Read More »

click to vote

NIPS
2001

144views Information Technology» more NIPS 2001»

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

13 years 7 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

claim paper

Read More »

click to vote

NIPS
1998

140views Information Technology» more NIPS 1998»

Gradient Descent for General Reinforcement Learning

13 years 7 months ago

Download www.ri.cmu.edu

A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcementlearning algorithms. These algorithms solve a number ...

Leemon C. Baird III, Andrew W. Moore

claim paper

Read More »

click to vote

AAAI
2010

191views Intelligent Agents» more AAAI 2010»

Relative Entropy Policy Search

13 years 7 months ago

Download www.kyb.tuebingen.mpg.de

Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...

Jan Peters, Katharina Mülling, Yasemin Altun

claim paper

Read More »

« Prev « First page 3 / 10 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers