Search Sciweavers | Sciweavers

103 search results - page 1 / 21

COLT
2010
Springer

207 views, Machine Learning

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

13 years 2 months ago

Download www.colt2010.org

The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

ALT
2008
Springer

141 views, Machine Learning

Online Regret Bounds for Markov Decision Processes with Deterministic Transitions

14 years 1 month ago

Download personal.unileoben.ac.at

Abstract. We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...

Ronald Ortner

NIPS
2004

136 views, Information Technology

Nearly Tight Bounds for the Continuum-Armed Bandit Problem

13 years 5 months ago

Download books.nips.cc

In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. Wh...

Robert D. Kleinberg

LION
2010
Springer

190 views, Optimization

Algorithm Selection as a Bandit Problem with Unbounded Losses

13 years 8 months ago

Download como.vub.ac.be

Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber

JMLR
2010

125 views

Regret Bounds for Gaussian Process Bandit Problems

12 years 11 months ago

Download jmlr.csail.mit.edu

Bandit algorithms are concerned with trading exploration with exploitation where a number of options are available but we can only learn their quality by experimenting with them. ...

Steffen Grünewälder, Jean-Yves Audibert,...
