Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...
This paper considers the sequential prediction problem with expert advice under partial monitoring scenarios when the loss is unbounded. We deal with a wide class of the par...
In this paper we consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The algorithms can select which opti...
We formalize the associative bandit problem framework introduced by Kaelbling as a learning-theory problem. The learning environment is modeled as a k-armed bandit where arm payof...
Alexander L. Strehl, Chris Mesterharm, Michael L. ...