policy gradient | Sciweavers

NIPS
2008

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

13 years 6 months ago

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake

AAAI
2010

Relative Entropy Policy Search

13 years 6 months ago

Download www.kyb.tuebingen.mpg.de

Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...

Jan Peters, Katharina Mülling, Yasemin Altun

ECML
2007
Springer

Policy Gradient Critics

13 years 10 months ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber
