Sciweavers

13 search results - page 1 / 3
CORR
2012
Springer
The best of both worlds: stochastic and adversarial bandits
We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...
Sébastien Bubeck, Aleksandrs Slivkins
SODA
2012
ACM
Simultaneous approximations for adversarial and stochastic online budgeted allocation
Motivated by online ad allocation, we study the problem of simultaneous approximations for the adversarial and stochastic online budgeted allocation problem. This problem consists...
Vahab S. Mirrokni, Shayan Oveis Gharan, Morteza Za...
COLT
2008
Springer
Regret Bounds for Sleeping Experts and Bandits
We study on-line decision problems where the set of actions available to the decision algorithm varies over time. With a few notable exceptions, such problems remained larg...
Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...
SIAMCOMP
2002
The Nonstochastic Multiarmed Bandit Problem
In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freun...
CORR
2010
Springer
Learning in A Changing World: Non-Bayesian Restless Multi-Armed Bandit
We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. In this problem, at each time, a player chooses K out of N (N > K) arms to play. The state of ...
Haoyang Liu, Keqin Liu, Qing Zhao