cumulative regret | Sciweavers


CORR
2008
Springer


13 years 4 months ago

We consider bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an r-dimensional random vect...

Paat Rusmevichientong, John N. Tsitsiklis


COLT
2007
Springer


Regret to the Best vs. Regret to the Average

13 years 11 months ago

Download www.math.tau.ac.il

Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the av...

Eyal Even-Dar, Michael J. Kearns, Yishay Mansour, ...


ALT
2009
Springer


Pure Exploration in Multi-armed Bandits Problems

14 years 1 month ago

Download sequel.futurs.inria.fr

Abstract. We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that explore sequentially the arms. The stra...

Sébastien Bubeck, Rémi Munos, Gilles...

