Sciweavers

133 search results - page 4 / 27
» Hierarchical Policy Gradient Algorithms
ICMLA
2010
13 years 3 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
ICRA
2005
IEEE
13 years 11 months ago
Learning Sensory Feedback to CPG with Policy Gradient for Biped Locomotion
This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...
Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...
AAAI
2010
13 years 7 months ago
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...
Chongjie Zhang, Victor R. Lesser
IROS
2006
IEEE
13 years 11 months ago
Policy Gradient Methods for Robotics
The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
Jan Peters, Stefan Schaal
ICANN
2010
Springer
13 years 6 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J...