Search Sciweavers | Sciweavers

COLT
2008
Springer

Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization

13 years 6 months ago

We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural general...

Jacob Abernethy, Elad Hazan, Alexander Rakhlin

SIAMCOMP
2002

The Nonstochastic Multiarmed Bandit Problem

13 years 4 months ago

Download homes.dsi.unimi.it

Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...

Peter Auer, Nicolò Cesa-Bianchi, Yoav Freun...

JMLR
2012

PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

11 years 7 months ago

Download homes.di.unimi.it

We develop a new tool for data-dependent analysis of the exploration-exploitation trade-off in learning under limited feedback. Our tool is based on two main ingredients. The fi...

Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...

CORR
2012
Springer

The best of both worlds: stochastic and adversarial bandits

12 years 17 days ago

Download www.princeton.edu

We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...

Sébastien Bubeck, Aleksandrs Slivkins

ECML
2005
Springer

Multi-armed Bandit Algorithms and Empirical Evaluation

13 years 10 months ago

Download www.cs.nyu.edu

The multi-armed bandit problem for a gambler is to decide which arm of a K-slot machine to pull to maximize his total reward in a series of trials. Many real-world learning and opt...

Joannès Vermorel, Mehryar Mohri
