ML
1998
ACM

Co-Evolution in the Successful Learning of Backgammon Strategy

Following Tesauro’s work on TD-Gammon, we used a 4,000-parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement, or temporal-difference learning methods were employed. Instead we apply simple hill-climbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger, changing the weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a “meta-game” of self-learning.
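The hill-climbing loop the abstract describes can be sketched as follows. This is a toy illustration, not the authors' code: the weight count, mutation scale, match length, and the `play_game` stand-in are all assumptions made here for brevity — the real system rolled dice, enumerated legal backgammon moves, and picked the move the network rated highest.

```python
import random

N_WEIGHTS = 20  # toy stand-in for the paper's ~4,000-parameter network


def mutate(weights, sigma=0.05):
    """Return a slightly perturbed challenger (Gaussian noise per weight)."""
    return [w + random.gauss(0.0, sigma) for w in weights]


def play_game(champion, challenger):
    """Toy stand-in for one game: each player scores a random 'position'
    with its weights, and the higher score wins. A real implementation
    would play out full backgammon games between the two networks."""
    position = [random.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
    champ_score = sum(w * x for w, x in zip(champion, position))
    chall_score = sum(w * x for w, x in zip(challenger, position))
    return 1 if chall_score > champ_score else 0


def hill_climb(generations=1000, games_per_match=4):
    champion = [0.0] * N_WEIGHTS  # start from the all-zero champion
    for _ in range(generations):
        challenger = mutate(champion)
        wins = sum(play_game(champion, challenger)
                   for _ in range(games_per_match))
        if wins > games_per_match // 2:
            # challenger beat the champion: adopt its weights
            champion = challenger
    return champion
```

Note there is no gradient or reward signal anywhere: fitness is purely relative (did the challenger beat the current champion?), which is the sense in which the method is hill-climbing in a co-evolutionary, self-play environment.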
Jordan B. Pollack, Alan D. Blair
Type Journal