Sciweavers

95 search results - page 6 / 19
» Policy Gradients for Cryptanalysis
NIPS
2008
15 years 6 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
ICRA
2005
IEEE
15 years 10 months ago
Learning Sensory Feedback to CPG with Policy Gradient for Biped Locomotion
This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...
Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...
ICANN
2010
Springer
15 years 5 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J...
IROS
2007
IEEE
15 years 11 months ago
An extended policy gradient algorithm for robot task learning
Andrea Cherubini, Francesca Giannone, Luca Iocchi,...
AIPS
2007
15 years 7 months ago
FF + FPG: Guiding a Policy-Gradient Planner
Olivier Buffet, Douglas Aberdeen