We formalize the associative bandit problem framework introduced by Kaelbling as a learning-theory problem. The learning environment is modeled as a k-armed bandit where arm payof...
Alexander L. Strehl, Chris Mesterharm, Michael L. ...
We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of M users and N M resources. F...
In this paper we consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The algorithms can select which opti...
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
In pay-per click sponsored search auctions which are cur-
rently extensively used by search engines, the auction for
a keyword involves a certain number of advertisers (say k)
c...