Search Sciweavers | Sciweavers

CORR
2011
Springer

Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems

12 years 11 months ago

The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...

Yasin Abbasi-Yadkori, Dávid Pál, Csa...

COLT
2007
Springer

Improved Rates for the Stochastic Continuum-Armed Bandit Problem

13 years 11 months ago

Download www.sztaki.hu

Abstract. Considering one-dimensional continuum-armed bandit problems, we propose an improvement of an algorithm of Kleinberg and a new set of conditions which give rise to improve...

Peter Auer, Ronald Ortner, Csaba Szepesvári

COLT
2008
Springer

Regret Bounds for Sleeping Experts and Bandits

13 years 6 months ago

Download colt2008.cs.helsinki.fi

We study on-line decision problems where the set of actions that are available to the decision algorithm varies over time. With a few notable exceptions, such problems remained larg...

Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...

COLT
2010
Springer

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

13 years 2 months ago

Download www.colt2010.org

The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

CORR
2010
Springer

Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards

12 years 11 months ago

Download ceng.usc.edu

In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...

Yi Gai, Bhaskar Krishnamachari, Rahul Jain
