In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...
We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed ...
We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find ...
In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. Wh...
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of n trials so as to maximize the total payoff of the chosen strategies. While ...