Sciweavers

CORR
2012
Springer


The best of both worlds: stochastic and adversarial bandits

12 years 6 days ago

We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...

Sébastien Bubeck, Aleksandrs Slivkins
