We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
— We consider a financial decision problem involving dynamic investment decisions on a single risky instrument over multiple and discrete time periods. Investment returns are as...
Ufuk Topcu, Giuseppe Carlo Calafiore, Laurent El G...
The problem of aggregating multiple numerical criteria to form overall objective functions is of considerable importance in many disciplines. The ordered weighted averaging (OWA) a...
Abstract. Many graph problems seem to require knowledge that extends beyond the immediate neighbors of a node. The usual self-stabilizing model only allows for nodes to make decisi...
Wayne Goddard, Stephen T. Hedetniemi, David Pokras...
This paper investigates relative precision and optimality of analyses for concurrent probabilistic systems. Aiming at the problem at the heart of probabilistic model checking ? com...