Sciweavers

ATAL
2015
Springer

Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs

We consider an autonomous agent facing a partially observable, stochastic, multiagent environment where the unknown policies of other agents are represented as finite state controllers (FSCs). We show how an agent can (i) learn the FSCs of the other agents, and (ii) exploit these models during interactions. To separate the issues of off-line versus on-line learning, we consider here an off-line two-phase approach. During the first phase, the agent observes the other player(s) interacting with the environment (the observations may be imperfect, and the learning agent does not take part in the interaction). The collected data is used to learn an ensemble of FSCs that explain the behavior of the other agent(s) using a Bayesian nonparametric (BNP) approach. We verify the quality of the learned models during the second phase by allowing the agent to compute its own optimal policy and interact with the observed agent. The optimal policy for the learning agent is obtained by solving a...
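As an illustration of the policy representation the abstract refers to, the sketch below shows a minimal deterministic finite state controller in Python: each node prescribes an action, and each observation moves the controller to a next node. This is not the authors' code or their learning method. The class name, the helper functions, and the toy action and observation labels are all hypothetical, chosen only to show what an FSC policy looks like as a data structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FiniteStateController:
    """Hypothetical deterministic FSC: nodes map to actions,
    (node, observation) pairs map to successor nodes."""
    action_of: Dict[int, str]
    next_node: Dict[Tuple[int, str], int]
    node: int = 0  # current controller node (start node is 0 by assumption)

    def act(self) -> str:
        # The action depends only on the current node.
        return self.action_of[self.node]

    def observe(self, observation: str) -> None:
        # The observation deterministically selects the next node.
        self.node = self.next_node[(self.node, observation)]


def run(fsc: FiniteStateController, observations: List[str]) -> List[str]:
    """Return the actions the controller emits, one per observation."""
    actions = []
    for obs in observations:
        actions.append(fsc.act())
        fsc.observe(obs)
    return actions


def make_demo_controller() -> FiniteStateController:
    # Toy three-node controller with made-up labels: listen first,
    # then open the door opposite the growl, then start over.
    observations = ("growl-left", "growl-right")
    next_node = {(0, "growl-left"): 1, (0, "growl-right"): 2}
    for node in (1, 2):
        for obs in observations:
            next_node[(node, obs)] = 0
    return FiniteStateController(
        action_of={0: "listen", 1: "open-right", 2: "open-left"},
        next_node=next_node,
    )


if __name__ == "__main__":
    print(run(make_demo_controller(), ["growl-left", "growl-right", "growl-right"]))
```

An observer in the first phase described above would see only (possibly noisy) traces of such actions and observations, not the node structure itself; inferring the number of nodes and their transitions from those traces is the learning problem the abstract addresses.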
Alessandro Panella, Piotr J. Gmytrasiewicz
Added 16 Apr 2016
Updated 16 Apr 2016
Type Journal
Year 2015
Where ATAL
Authors Alessandro Panella, Piotr J. Gmytrasiewicz