Search Sciweavers | Sciweavers

5 search results - page 1 / 1

» Regret Bounds for Sleeping Experts and Bandits

click to vote

COLT
2008
Springer

140views Machine Learning» more COLT 2008»

Regret Bounds for Sleeping Experts and Bandits

13 years 6 months ago

Download colt2008.cs.helsinki.fi

We study on-line decision problems where the set of actions that are available to the decision algorithm vary over time. With a few notable exceptions, such problems remained larg...

Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...

claim paper

Read More »

click to vote

COLT
2005
Springer

128views Machine Learning» more COLT 2005»

From External to Internal Regret

13 years 6 months ago

Download www.cs.cmu.edu

External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares ...

Avrim Blum, Yishay Mansour

claim paper

Read More »

click to vote

CORR
2011
Springer

202views Education» more CORR 2011»

Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems

12 years 12 months ago

Download www.ualberta.ca

The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...

Yasin Abbasi-Yadkori, Dávid Pál, Csa...

claim paper

Read More »

click to vote

JMLR
2010

103views more JMLR 2010»

Regret Bounds and Minimax Policies under Partial Monitoring

12 years 11 months ago

Download jmlr.csail.mit.edu

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...

Jean-Yves Audibert, Sébastien Bubeck

claim paper

Read More »

click to vote

LION
2010
Springer

190views Optimization» more LION 2010»

Algorithm Selection as a Bandit Problem with Unbounded Losses

13 years 8 months ago

Download como.vub.ac.be

Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate ofﬂine training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber

claim paper

Read More »

« Prev « First page 1 / 1 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers