Sciweavers

ICML
2007
IEEE

Multi-armed bandit problems with dependent arms

Download www.cs.cmu.edu

We provide a framework for exploiting dependencies among arms in multi-armed bandit problems when the dependencies take the form of a generative model on clusters of arms. We find an optimal MDP-based policy for the discounted reward case and give an approximation of it with a formal error guarantee. We discuss lower bounds on regret in the undiscounted reward scenario and propose a general two-level bandit policy for it. We propose three instantiations of this general policy and provide theoretical justification of how the regret of each instantiated policy depends on the characteristics of the clusters. Finally, we empirically demonstrate the efficacy of our policies on large-scale real-world and synthetic data, and show that they significantly outperform classical policies designed for bandits with independent arms.
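
To make the idea of a two-level policy concrete, here is a minimal sketch of one way such a policy could be structured: a top-level bandit chooses a cluster, then a second bandit chooses an arm inside that cluster. The abstract does not specify the paper's three instantiations, so everything below is an assumption made for illustration: the use of UCB1 at both levels, the pooled cluster reward as the cluster-level statistic, Bernoulli rewards, and the names `UCB1`, `TwoLevelBandit`, and `simulate`.

```python
import math
import random


class UCB1:
    """Standard UCB1 index policy over a fixed number of options."""

    def __init__(self, n_options):
        self.counts = [0] * n_options   # pulls per option
        self.sums = [0.0] * n_options   # total reward per option

    def select(self):
        # Try every option once before using the confidence bound.
        for i, count in enumerate(self.counts):
            if count == 0:
                return i
        total = sum(self.counts)

        def index(i):
            mean = self.sums[i] / self.counts[i]
            return mean + math.sqrt(2.0 * math.log(total) / self.counts[i])

        return max(range(len(self.counts)), key=index)

    def update(self, i, reward):
        self.counts[i] += 1
        self.sums[i] += reward


class TwoLevelBandit:
    """Hypothetical two-level policy: pick a cluster, then an arm in it."""

    def __init__(self, cluster_sizes):
        # Top level treats each cluster as a single pooled arm (assumption).
        self.top = UCB1(len(cluster_sizes))
        # One independent bandit per cluster for the arms inside it.
        self.within = [UCB1(size) for size in cluster_sizes]

    def select(self):
        cluster = self.top.select()
        arm = self.within[cluster].select()
        return cluster, arm

    def update(self, cluster, arm, reward):
        self.top.update(cluster, reward)
        self.within[cluster].update(arm, reward)


def simulate(probs, horizon, seed=0):
    """Run the policy on Bernoulli arms; probs[c][a] is a success rate."""
    rng = random.Random(seed)
    policy = TwoLevelBandit([len(p) for p in probs])
    total = 0.0
    for _ in range(horizon):
        cluster, arm = policy.select()
        reward = 1.0 if rng.random() < probs[cluster][arm] else 0.0
        policy.update(cluster, arm, reward)
        total += reward
    return policy, total
```

For example, `simulate([[0.1, 0.12], [0.8, 0.85]], 2000)` runs the sketch on two clusters of two arms each; when arms in a cluster have similar success rates, the top level can discard a weak cluster without sampling each of its arms heavily, which is the intuition behind exploiting cluster structure.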

Sandeep Pandey, Deepayan Chakrabarti, Deepak Agarwal

ICML 2007 | Machine Learning | Optimal MDP-based Policy | Two-level Bandit Policy | Undiscounted Reward Scenario

Related Content

» Best Arm Identification in Multi-Armed Bandits

» On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards

» Combinatorial Network Optimization with Unknown Variables Multi-Armed Bandits with Linear R...

» The Non-Bayesian Restless Multi-Armed Bandit a Case of Near-Logarithmic Regret

» An Optimal Dynamic Mechanism for Multi-Armed Bandit Processes

» Multi-Armed Bandits in Metric Spaces

» Mortal Multi-Armed Bandits

» Learning in A Changing World Non-Bayesian Restless Multi-Armed Bandit

» Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards

» How to Beat the Adaptive Multi-Armed Bandit

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2007
Where	ICML
Authors	Sandeep Pandey, Deepayan Chakrabarti, Deepak Agarwal
