In this paper, we introduce a new heuristic search algorithm based on mean values for anytime planning, called MHSP. It consists in associating the principles of UCT, a bandit-base...
Bayesian games can be used to model single-shot decision problems in which agents only possess incomplete information about other agents, and hence are important for multiagent co...
Frans A. Oliehoek, Matthijs T. J. Spaan, Jilles St...
We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...
H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum
: We introduce a graph-theoretic generalization of classical Arrow-Debreu economics, in which an undirected graph specifies which consumers or economies are permitted to engage in...
This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as de...