Sciweavers

2 search results - page 1 / 1
ICML
2009
IEEE
14 years 10 months ago
Piecewise-stationary bandit problems with side observations
We consider a sequential decision problem where the rewards are generated by a piecewise-stationary distribution. However, the different reward distributions are unknown and may c...
Jia Yuan Yu, Shie Mannor
ICML
2008
IEEE
14 years 10 months ago
Exploration scavenging
We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evalua...
John Langford, Alexander L. Strehl, Jennifer Wortm...