Sciweavers

ICANN
1997
Springer

On Learning Soccer Strategies

We use simulated soccer to study multiagent learning. Each team's players (agents) share action set and policy but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively in case of goals. We conduct simulations with varying team sizes, and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Probabilistic Incremental Program Evolution (PIPE). TD-Q is based on evaluation functions (EFs) mapping input/action pairs to expected reward, while PIPE searches policy space directly. PIPE uses an adaptive probability distribution to synthesize programs that calculate action probabilities from current inputs. Our results show that TD-Q has difficulties to learn appropriate shared EFs. PIPE, however, does not depend on EFs and finds good policies faster and more reliably.
Rafal Salustowicz, Marco Wiering, Jürgen Schmidhuber
Added 08 Aug 2010
Updated 08 Aug 2010
Type Conference
Year 1997
Where ICANN
Authors Rafal Salustowicz, Marco Wiering, Jürgen Schmidhuber