Sciweavers

AAAI
2011

Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

12 years 4 months ago
Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs
In many multi-agent applications such as distributed sensor nets, a network of agents act collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach signiï¬...
Chongjie Zhang, Victor R. Lesser
Added 12 Dec 2011
Updated 12 Dec 2011
Type Journal
Year 2011
Where AAAI
Authors Chongjie Zhang, Victor R. Lesser
Comments (0)