Many program optimisations and analyses, such as array-bound checking and termination analysis, depend on knowing the size of a function's input and output. However, size inf...
This paper presents the CQ algorithm, which decomposes and solves a Markov Decision Process (MDP) by automatically generating a hierarchy of smaller MDPs using state variables. The ...
We consider a portfolio allocation problem where the objective function is a tail event, such as the probability of large portfolio losses. The dependence between assets is captured th...
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
A network of nodes communicates via noisy channels. Each node has some real-valued initial measurement or message. The goal of each of the nodes is to acquire an estimate...