—We propose a dynamic spectrum access scheme where secondary users recommend “good” channels to each other and access accordingly. We formulate the problem as an average rewa...
As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting humanmultiagent teams adapt to time-critical domains. I...
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, plannin...
Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J....
A problem of planning for cooperative teams under uncertainty is a crucial one in multiagent systems. Decentralized partially observable Markov decision processes (DECPOMDPs) prov...
The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...