We introduce a new algorithm based on linear programming that approximates the differential value function of an average-cost Markov decision process via a linear combination of p...
A new spectral approach to value function approximation has recently been proposed to automatically construct basis functions from samples. Global basis functions called proto-val...
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is relation among those tasks, then the information gained duri...
We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...
H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...