Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
Abstract--We present a number of significant engineering insights on what makes a good configuration for medium- to largesize wireless mesh networks (WMNs) when the objective funct...
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
This paper describes a method of supervised learning based on forward selection branching. This method improves fault tolerance by means of combining information related to general...
We consider the problem of cooperative multiagent planning under uncertainty, formalized as a decentralized partially observable Markov decision process (Dec-POMDP). Unfortunately...
Matthijs T. J. Spaan, Frans A. Oliehoek, Nikos A. ...