Search Sciweavers | Sciweavers

74 search results - page 2 / 15

» Regret Bounds for Gaussian Process Bandit Problems

click to vote

ICASSP
2011
IEEE

177views Signal Processing» more ICASSP 2011»

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

12 years 8 months ago

Download www.ece.ucdavis.edu

Abstract—We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...

Haoyang Liu, Keqin Liu, Qing Zhao

claim paper

Read More »

click to vote

ICML
2007
IEEE

139views Machine Learning» more ICML 2007»

Multi-armed bandit problems with dependent arms

14 years 5 months ago

Download www.cs.cmu.edu

We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find ...

Sandeep Pandey, Deepayan Chakrabarti, Deepak Agarw...

claim paper

Read More »

click to vote

ALT
2007
Springer

134views Machine Learning» more ALT 2007»

Tuning Bandit Algorithms in Stochastic Environments

14 years 1 months ago

Download www.sztaki.hu

Algorithms based on upper-conﬁdence bounds for balancing exploration and exploitation are gaining popularity since they are easy to implement, eﬃcient and eﬀective. In this p...

Jean-Yves Audibert, Rémi Munos, Csaba Szepe...

claim paper

Read More »

click to vote

ALT
2008
Springer

141views Machine Learning» more ALT 2008»

Online Regret Bounds for Markov Decision Processes with Deterministic Transitions

14 years 1 months ago

Download personal.unileoben.ac.at

Abstract. We consider an upper conﬁdence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...

Ronald Ortner

claim paper

Read More »

click to vote

COLT
2008
Springer

140views Machine Learning» more COLT 2008»

Regret Bounds for Sleeping Experts and Bandits

13 years 6 months ago

Download colt2008.cs.helsinki.fi

We study on-line decision problems where the set of actions that are available to the decision algorithm vary over time. With a few notable exceptions, such problems remained larg...

Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...

claim paper

Read More »

« Prev « First page 2 / 15 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers