Search Sciweavers | Sciweavers

66 search results - page 7 / 14

» The Nonstochastic Multiarmed Bandit Problem

TSP
2010

170views Artificial Intelligence» more TSP 2010»

Distributed learning in multi-armed bandit with multiple players

15 years 2 months ago

Download www.ece.ucdavis.edu

We formulate and study a decentralized multi-armed bandit (MAB) problem. There are distributed players competing for independent arms. Each arm, when played, offers i.i.d. reward a...

Keqin Liu, Qing Zhao

ICASSP
2011
IEEE

177views Signal Processing» more ICASSP 2011»

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

14 years 11 months ago

Download www.ece.ucdavis.edu

We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...

Haoyang Liu, Keqin Liu, Qing Zhao

COLT
2010
Springer

191views Machine Learning» more COLT 2010»

Best Arm Identification in Multi-Armed Bandits

15 years 5 months ago

Download www.di.ens.fr

We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...

Jean-Yves Audibert, Sébastien Bubeck, Ré...

Publication

466views

Multi-Armed Bandit Mechanisms for Multi-Slot Sponsored Search Auctions

16 years 6 months ago

Download arxiv.org

In pay-per-click sponsored search auctions, which are currently extensively used by search engines, the auction for a keyword involves a certain number of advertisers (say k) c...

Akash Das Sarma, Sujit Gujar, Y. Narahari

posted by sujit

CORR
2010
Springer

189views Education» more CORR 2010»

An Optimal Dynamic Mechanism for Multi-Armed Bandit Processes

15 years 7 months ago

Download research.microsoft.com

We consider the problem of revenue-optimal dynamic mechanism design in settings where agents' types evolve over time as a function of their (both public and private) experien...

Sham M. Kakade, Ilan Lobel, Hamid Nazerzadeh
