Bandit algorithms are concerned with trading exploration with exploitation where a number of options are available but we can only learn their quality by experimenting with them. ...
Many interesting problems, such as power grids, network switches, and tra c ow, that are candidates for solving with reinforcement learningRL, alsohave properties that make distri...
Jeff G. Schneider, Weng-Keen Wong, Andrew W. Moore...
Given a social network, who is the best person to introduce you to, say, Chris Ferguson, the poker champion? Or, given a network of people and skills, who is the best person to he...
In this paper we present a meeting state recognizer based on a combination of multi-modal sensor data in a smart room. Our approach is based on the training of a statistical model ...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...