We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...
This paper introduces an algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes. The algorithm looks for the best closed-l...
Lucian Busoniu, Damien Ernst, Bart De Schutter, Ro...
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, plannin...
Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J....
POMDPs and their decentralized multiagent counterparts, DEC-POMDPs, offer a rich framework for sequential decision making under uncertainty. Their computational complexity, howeve...
Christopher Amato, Daniel S. Bernstein, Shlomo Zil...