Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients

Download www.idsia.ch

Abstract. Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, despite its deceptively simple rules, has eluded the development of artificial expert players. In this paper we attempt to tackle this challenge through a combination of two recent developments in Machine Learning. We employ Multi-Dimensional Recurrent Neural Networks with Long Short-Term Memory cells to handle the multi-dimensional data of the board game in a very natural way. To improve both the convergence rate and the ultimate performance, we train those networks using Policy Gradients with Parameter-based Exploration, a recently developed Reinforcement Learning algorithm which has been found to have numerous advantages over Evolution Strategies. Our empirical results confirm the promise of this approach, and we discuss how it can be scaled up to expert-level Go players.
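
For readers unfamiliar with Policy Gradients with Parameter-based Exploration (PGPE), the following is a minimal sketch of its symmetric-sampling variant, not the authors' implementation. It assumes a toy quadratic reward in place of game outcomes and a plain parameter vector in place of network weights; the names `pgpe_maximize` and `toy_reward`, the learning rates, the baseline smoothing factor, and the clamping bounds on sigma are all illustrative assumptions.

```python
import random


def toy_reward(params):
    """Hypothetical stand-in for a game fitness: peaks at (1.0, -2.0)."""
    target = [1.0, -2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def pgpe_maximize(reward_fn, dim, iterations=2000,
                  lr_mu=0.05, lr_sigma=0.005, seed=0):
    """Simplified symmetric-sampling PGPE (illustrative sketch only).

    Parameters are drawn from independent Gaussians N(mu_i, sigma_i^2);
    the search distribution, not the policy's actions, is what gets perturbed.
    """
    rng = random.Random(seed)
    mu = [0.0] * dim
    sigma = [1.0] * dim
    baseline = None  # moving average of past rewards

    for _ in range(iterations):
        # One perturbation, evaluated in both directions around mu.
        eps = [rng.gauss(0.0, s) for s in sigma]
        r_plus = reward_fn([m + e for m, e in zip(mu, eps)])
        r_minus = reward_fn([m - e for m, e in zip(mu, eps)])
        r_mean = 0.5 * (r_plus + r_minus)
        if baseline is None:
            baseline = r_mean

        for i in range(dim):
            # Mean moves toward the better of the two mirrored samples.
            mu[i] += lr_mu * 0.5 * (r_plus - r_minus) * eps[i]
            # Spread widens if the pair beat the baseline with a large
            # perturbation, and narrows otherwise.
            sigma[i] += (lr_sigma * (r_mean - baseline)
                         * (eps[i] ** 2 - sigma[i] ** 2) / sigma[i])
            # Clamping is an assumption added here for numerical safety.
            sigma[i] = min(max(sigma[i], 0.01), 2.0)

        baseline = 0.9 * baseline + 0.1 * r_mean

    return mu, sigma


if __name__ == "__main__":
    mu, sigma = pgpe_maximize(toy_reward, dim=2)
    print("estimated optimum:", mu)
    print("final exploration widths:", sigma)
```

In the setting the abstract describes, `reward_fn` would presumably evaluate a network with the sampled weights by playing games, which this sketch does not attempt.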

Mandy Grüttner, Frank Sehnke, Tom Schaul, Jürgen Schmidhuber

Artificial Expert Players | ICANN 2010 | Neural Networks | Recurrent Neural Networks | Superior Artificial Board-game |

Post Info

Added	07 Dec 2010
Updated	07 Dec 2010
Type	Conference
Year	2010
Where	ICANN
Authors	Mandy Grüttner, Frank Sehnke, Tom Schaul, Jürgen Schmidhuber
