Sciweavers

66 search results - page 1 / 14
» The Nonstochastic Multiarmed Bandit Problem
Sort
View
COLT
2008
Springer
13 years 6 months ago
Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O ( T) regret. The setting is a natural general...
Jacob Abernethy, Elad Hazan, Alexander Rakhlin
SIAMCOMP
2002
124views more  SIAMCOMP 2002»
13 years 4 months ago
The Nonstochastic Multiarmed Bandit Problem
Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freun...
JMLR
2012
11 years 7 months ago
PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits
We develop a new tool for data-dependent analysis of the exploration-exploitation trade-off in learning under limited feedback. Our tool is based on two main ingredients. The fi...
Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...
CORR
2012
Springer
192views Education» more  CORR 2012»
12 years 17 days ago
The best of both worlds: stochastic and adversarial bandits
We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...
Sébastien Bubeck, Aleksandrs Slivkins
ECML
2005
Springer
13 years 10 months ago
Multi-armed Bandit Algorithms and Empirical Evaluation
The multi-armed bandit problem for a gambler is to decide which arm of a K-slot machine to pull to maximize his total reward in a series of trials. Many real-world learning and opt...
Joannès Vermorel, Mehryar Mohri