Search Sciweavers | Sciweavers

22 search results - page 5 / 5

» Contextual Multi-Armed Bandits

ICML
2008
IEEE

120 views, Machine Learning

Exploration scavenging

14 years 6 months ago

Download hunch.net

We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evalua...

John Langford, Alexander L. Strehl, Jennifer Wortm...

CoRR
2011
Springer

161 views, Education

Doubly Robust Policy Evaluation and Learning

12 years 9 months ago

Download www.icml-2011.org

We study decision making in environments where the reward is only partially observed, but can be modeled as a function of an action and an observed context. This setting, known as...

Miroslav Dudík, John Langford, Lihong Li
