Sciweavers


CORR
2012
Springer


The best of both worlds: stochastic and adversarial bandits

14 years 2 months ago

We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...

Sébastien Bubeck, Aleksandrs Slivkins

