Sciweavers

163 search results - page 5 / 33
» Policy Gradient Methods for Robotics
ICANN
2007
Springer
15 years 3 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
ICMLA
2010
14 years 7 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
NIPS
2008
14 years 11 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
JMLR
2006
14 years 9 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
AIPS
2007
14 years 12 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet