Search Sciweavers | Sciweavers

73 search results - page 1 / 15

» Stochastic Linear Optimization under Bandit Feedback

click to vote

COLT
2008
Springer

96views Machine Learning» more COLT 2008»

Stochastic Linear Optimization under Bandit Feedback

13 years 6 months ago

Download people.cs.uchicago.edu

Varsha Dani, Thomas P. Hayes, Sham M. Kakade

claim paper

Read More »

click to vote

CORR
2012
Springer

210views Education» more CORR 2012»

Towards minimax policies for online linear optimization with bandit feedback

12 years 20 days ago

Download www.princeton.edu

We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of...

Sébastien Bubeck, Nicolò Cesa-Bianch...

claim paper

Read More »

click to vote

FOCS
2007
IEEE

157views Theoretical Computer Science» more FOCS 2007»

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

13 years 11 months ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

claim paper

Read More »

click to vote

JMLR
2010

103views more JMLR 2010»

Regret Bounds and Minimax Policies under Partial Monitoring

12 years 11 months ago

Download jmlr.csail.mit.edu

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...

Jean-Yves Audibert, Sébastien Bubeck

claim paper

Read More »

click to vote

JMLR
2012

165views Programming Languages» more JMLR 2012»

PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

11 years 7 months ago

Download homes.di.unimi.it

We develop a new tool for data-dependent analysis of the exploration-exploitation trade-oﬀ in learning under limited feedback. Our tool is based on two main ingredients. The ﬁ...

Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...

claim paper

Read More »

« Prev « First page 1 / 15 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers