The paper takes a fresh look at algorithms for maximizing expected utility over a set of policies, that is, a set of possible ways of reacting to observations about an uncertain s...
We present an anytime algorithm which computes policies for decision problems represented as multi-stage influence diagrams. Our algorithm constructs policies incrementally, start...
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
In a public cloud, bandwidth is traditionally priced in a pay-as-you-go model. Reflecting the recent trend of augmenting cloud computing with bandwidth guarantees, we consider a n...
Partially-observable Markov decision processes provide a very general model for decision-theoretic planning problems, allowing the trade-offs between various courses of actions t...