Sciweavers

44 search results for "A structured multiarmed bandit problem and the greedy policy" (page 1 of 9)
CDC
2008
IEEE
13 years 11 months ago
A structured multiarmed bandit problem and the greedy policy
We consider a multiarmed bandit problem where the expected reward of each arm is a linear function of an unknown scalar with a prior distribution. The objective is to choose a s...
Adam J. Mersereau, Paat Rusmevichientong, John N. ...
SDM
2007
SIAM
13 years 6 months ago
Bandits for Taxonomies: A Model-based Approach
We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed ...
Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabar...
ICML
2007
IEEE
14 years 5 months ago
Multi-armed bandit problems with dependent arms
We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find ...
Sandeep Pandey, Deepayan Chakrabarti, Deepak Agarw...
ECML
2005
Springer
13 years 10 months ago
Multi-armed Bandit Algorithms and Empirical Evaluation
The multi-armed bandit problem for a gambler is to decide which arm of a K-slot machine to pull to maximize his total reward in a series of trials. Many real-world learning and opt...
Joannès Vermorel, Mehryar Mohri
CORR
2010
Springer
13 years 1 month ago
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...