Sciweavers

ICML
2008
IEEE
14 years 5 months ago
Exploration scavenging
We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evalua...
John Langford, Alexander L. Strehl, Jennifer Wortm...