Sciweavers

ICANN
2007
Springer
13 years 11 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
GECCO
2005
Springer
13 years 10 months ago
Co-evolving recurrent neurons learn deep memory POMDPs
Recurrent neural networks are theoretically capable of learning complex temporal sequences, but training them through gradient descent is too slow and unstable for practical use i...
Faustino J. Gomez, Jürgen Schmidhuber
ICANN
2010
Springer
13 years 5 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Abstract. Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, Jü...
ECML
2007
Springer
13 years 11 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
JMLR
2010
13 years 3 months ago
PyBrain
PyBrain is a versatile machine learning library for Python. Its goal is to provide flexible, easy-to-use yet still powerful algorithms for machine learning tasks, including a vari...
Tom Schaul, Justin Bayer, Daan Wierstra, Yi Sun, M...