bandit | Sciweavers

JMLR
2012

Contextual Bandit Learning with Predictable Rewards

11 years 7 months ago

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...

Alekh Agarwal, Miroslav Dudík, Satyen Kale,...

AMAI
2011
Springer

Multi-armed bandits with episode context

12 years 4 months ago

A multi-armed bandit episode consists of n trials, each allowing selection of one of K arms, resulting in payoff from a distribution over [0, 1] associated with that arm. We assum...

Christopher D. Rosin

AGI
2011

Reinforcement Learning and the Bayesian Control Rule

12 years 8 months ago

We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intra...

Pedro Alejandro Ortega, Daniel Alexander Braun, Si...

JMLR
2010

Regret Bounds and Minimax Policies under Partial Monitoring

12 years 11 months ago

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...

Jean-Yves Audibert, Sébastien Bubeck

CORR
2008
Springer

Multi-Armed Bandits in Metric Spaces

13 years 4 months ago

In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of n trials so as to maximize the total payoff of the chosen strategies. While ...

Robert Kleinberg, Aleksandrs Slivkins, Eli Upfal

SDM
2007
SIAM

Bandits for Taxonomies: A Model-based Approach

13 years 6 months ago

We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed ...

Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabar...
