The best of both worlds: stochastic and adversarial bandits

We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifically, SAO combines the O(√n) worst-case regret of Exp3 [Auer et al., 2002b] for adversarial rewards with the (poly)logarithmic regret of UCB1 [Auer et al., 2002a] for stochastic rewards. Adversarial rewards and stochastic rewards are the two main settings in the literature on (non-Bayesian) multi-armed bandits. Prior work on multi-armed bandits treats them separately and does not attempt to jointly optimize for both. Our result falls into a general theme of achieving good worst-case performance while also taking advantage of “nice” problem instances, an important issue in the design of algorithms with partially known inputs.
Sébastien Bubeck, Aleksandrs Slivkins
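
For context, here is a minimal Python sketch of the two baselines the abstract contrasts: UCB1's confidence-bound index for stochastic rewards and Exp3's exponential-weights update for adversarial rewards. This is not the paper's SAO algorithm; the function names, the gamma value, and the Bernoulli demo instance are illustrative assumptions, and rewards are assumed to lie in [0, 1].

    import math
    import random

    def ucb1(pull, n_arms, horizon):
        # UCB1 [Auer et al., 2002a]: play the arm maximizing the empirical
        # mean plus a sqrt(2 ln t / n_i) confidence bonus; (poly)logarithmic
        # regret on stochastic instances.
        counts = [0] * n_arms
        sums = [0.0] * n_arms
        for t in range(1, horizon + 1):
            if t <= n_arms:
                arm = t - 1                  # pull each arm once to initialize
            else:
                arm = max(range(n_arms),
                          key=lambda i: sums[i] / counts[i]
                          + math.sqrt(2 * math.log(t) / counts[i]))
            reward = pull(arm)
            counts[arm] += 1
            sums[arm] += reward

    def exp3(pull, n_arms, horizon, gamma=0.1):
        # Exp3 [Auer et al., 2002b]: exponential weights over arms with
        # importance-weighted reward estimates; O(sqrt(n)) worst-case regret
        # against any fixed arm, even for adversarial rewards.
        weights = [1.0] * n_arms
        for _ in range(horizon):
            total = sum(weights)
            probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
            arm = random.choices(range(n_arms), weights=probs)[0]
            reward = pull(arm)               # reward assumed in [0, 1]
            estimate = reward / probs[arm]   # unbiased estimate of arm's reward
            weights[arm] *= math.exp(gamma * estimate / n_arms)
            top = max(weights)
            weights = [w / top for w in weights]  # rescale to avoid overflow

    if __name__ == "__main__":
        # Illustrative stochastic instance: three Bernoulli arms.
        means = [0.3, 0.5, 0.7]
        pull = lambda i: 1.0 if random.random() < means[i] else 0.0
        ucb1(pull, len(means), horizon=10_000)
        exp3(pull, len(means), horizon=10_000)

Per the abstract, SAO achieves both guarantees simultaneously; its actual construction is given in the paper and is more involved than either baseline above.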
Added: 20 Apr 2012
Updated: 20 Apr 2012
Type: Journal
Year: 2012
Where: CoRR
Authors: Sébastien Bubeck, Aleksandrs Slivkins