Sciweavers

UAI
2000

Learning to Cooperate via Policy Search

13 years 5 months ago
Learning to Cooperate via Policy Search
Cooperative games are those in which both agents share the same payoff structure. Valuebased reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning cooperative games, but they only apply when the game state is completely observable to both agents. Policy search methods are a reasonable alternative to value-based methods for partially observable environments. In this paper, we provide a gradient-based distributed policysearch method for cooperative games and compare the notion of local optimum to that of Nash equilibrium. We demonstrate the effectiveness of this method experimentally in a small, partially observable simulated soccer domain.
Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Les
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where UAI
Authors Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Leslie Pack Kaelbling
Comments (0)