Learning to compete, compromise, and cooperate in repeated general-sum games

Learning algorithms often obtain relatively low average payoffs in repeated general-sum games played against other learning agents, due to a focus on myopic best responses and one-shot Nash equilibrium (NE) strategies. A less myopic approach focuses instead on NEs of the repeated game, which suggests that a learning agent should possess at least two properties. First, an agent should never learn to play a strategy whose average payoff is less than the minimax value of the game. Second, an agent should learn to cooperate or compromise when doing so is beneficial. No learning algorithm in the literature is known to possess both properties. We present a reinforcement learning algorithm (M-Qubed) that provably satisfies the first property and, in self-play, empirically displays the second in a wide range of games.
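The minimax value in the first property is the agent's security level: the highest average payoff it can guarantee no matter how the opponent plays. As an illustration only (not the authors' code), the sketch below computes this value for the row player of a two-player matrix game with the standard maximin linear program; the function name, example payoffs, and use of scipy.optimize.linprog are assumptions made for this example.

# Minimal sketch (assumed helper, not from the paper): the row player's
# minimax (security) value of a matrix game via a maximin linear program.
import numpy as np
from scipy.optimize import linprog

def minimax_value(payoffs):
    """Return (value, strategy): the row player's security level and a
    mixed strategy guaranteeing it. payoffs[i][j] is the row player's
    payoff for row action i against column action j."""
    A = np.asarray(payoffs, dtype=float)
    n_rows, n_cols = A.shape
    # Decision variables: mixed strategy x (n_rows entries), then value v.
    # Maximize v, i.e. minimize -v.
    c = np.zeros(n_rows + 1)
    c[-1] = -1.0
    # For every opponent column j: v - sum_i x[i] * A[i, j] <= 0.
    A_ub = np.hstack([-A.T, np.ones((n_cols, 1))])
    b_ub = np.zeros(n_cols)
    # x must be a probability distribution: sum_i x[i] = 1.
    A_eq = np.append(np.ones(n_rows), 0.0).reshape(1, -1)
    b_eq = [1.0]
    bounds = [(0.0, 1.0)] * n_rows + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:-1]

# Prisoner's dilemma, row player's payoffs (row 0 = cooperate, row 1 = defect).
value, strategy = minimax_value([[3, 0], [5, 1]])
print(value, strategy)  # security level 1.0, achieved by always defecting

Note that for the prisoner's dilemma payoffs above, the security level equals the mutual-defection payoff, so the first property alone never rules out defection; the cooperation and compromise of the second property is what M-Qubed exhibits empirically in self-play.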
Added: 17 Nov 2009
Updated: 17 Nov 2009
Type: Conference
Year: 2005
Where: ICML
Authors: Jacob W. Crandall, Michael A. Goodrich