A selection-mutation model for q-learning in multi-agent systems

Although well understood in the single-agent framework, the use of traditional reinforcement learning (RL) algorithms in multi-agent systems (MAS) is not always justified. The feedback an agent experiences in a MAS is usually influenced by the other agents present in the system. Multi-agent environments are therefore non-stationary, and the convergence and optimality guarantees of RL algorithms are lost. To better understand the dynamics of traditional RL algorithms, we analyze the learning process in terms of evolutionary dynamics. More specifically, we show how the Replicator Dynamics (RD) can be used as a model for Q-learning in games. The dynamical equations of Q-learning are derived and illustrated by some well-chosen experiments. Both reveal an interesting connection between the exploitation-exploration scheme from RL and the selection-mutation mechanisms from evolutionary game theory.

Categories and Subject Descriptors: I.2 [Artificial Intelligence]: Learning
General Terms: Theory, ...
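The abstract does not reproduce the dynamical equations, so the following is only a minimal sketch of what a selection-mutation form of the replicator dynamics can look like. It assumes the form commonly cited for Q-learning with Boltzmann exploration in a two-player matrix game: a selection term scaled by alpha/tau (the usual replicator term) plus a mutation term scaled by alpha that pulls the strategy toward uniform play. The function name, the parameters `alpha`, `tau`, and `dt`, the Euler integration, and the payoff matrix in the usage lines are all illustrative assumptions, not taken from the listing.

```python
import math

def selection_mutation_step(x, y, A, alpha, tau, dt):
    """One Euler step of an assumed selection-mutation dynamic.

    x     : row player's mixed strategy (all entries must be > 0)
    y     : opponent's mixed strategy
    A     : row player's payoff matrix, A[i][j]
    alpha : learning rate (assumed parameter name)
    tau   : Boltzmann temperature (assumed parameter name)
    dt    : integration step size (assumed)
    """
    n = len(x)
    # Expected payoff of each pure action against y, and the average payoff.
    payoff = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(n)]
    average = sum(x[i] * payoff[i] for i in range(n))

    new_x = []
    for i in range(n):
        # Selection: actions that beat the average grow (replicator term).
        selection = (alpha / tau) * x[i] * (payoff[i] - average)
        # Mutation: entropy-like term that pushes x toward the uniform strategy.
        mutation = alpha * x[i] * sum(
            x[j] * math.log(x[j] / x[i]) for j in range(n)
        )
        new_x.append(x[i] + dt * (selection + mutation))

    # Both terms sum to zero over i; renormalize only to absorb float drift.
    total = sum(new_x)
    return [v / total for v in new_x]

# Hypothetical usage: a Prisoner's Dilemma payoff matrix for the row player
# (actions: cooperate, defect), starting from uniform play.
pd_payoffs = [[3.0, 0.0], [5.0, 1.0]]
x_next = selection_mutation_step([0.5, 0.5], [0.5, 0.5], pd_payoffs,
                                 alpha=0.1, tau=1.0, dt=0.1)
```

In this sketch a small `tau` makes the selection term dominate (exploitation), while a large `tau` leaves mostly the mutation term (exploration), which is one way to read the connection the abstract describes.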

Karl Tuyls, Katja Verbeeck, Tom Lenaerts

ATAL 2003 | Reinforcement Learning | Replicator Dynamics | RL Algorithms

Added	06 Jul 2010
Updated	06 Jul 2010
Type	Conference
Year	2003
Where	ATAL
Authors	Karl Tuyls, Katja Verbeeck, Tom Lenaerts
