Sciweavers

CORR
2012
Springer

Towards minimax policies for online linear optimization with bandit feedback

We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of order √(dn log N) for any finite action set with N actions, under the assumption that the instan
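To make the setting concrete, here is a minimal sketch of an exponential-weights strategy for bandit linear optimization over a finite action set. It is not taken from the paper: the listing does not say which exploration distribution the authors use, so this sketch assumes uniform exploration, and the function names and the parameters `eta` (learning rate) and `gamma` (exploration rate) are hypothetical.

```python
import numpy as np

def sampling_distribution(actions, cum_loss_est, eta, gamma):
    """Exponential weights over the N actions, mixed with uniform exploration.

    Uniform exploration is an assumption of this sketch, not the paper's choice.
    """
    scores = -eta * (actions @ cum_loss_est)
    scores -= scores.max()  # shift for numerical stability; ratios are unchanged
    q = np.exp(scores)
    q /= q.sum()
    return (1.0 - gamma) * q + gamma / len(actions)

def loss_estimate(actions, p, played, observed_loss):
    """Estimate the loss vector from a single scalar observation.

    Uses observed_loss * pinv(P) @ a_t, where P = sum_a p(a) a a^T.
    """
    P = (actions * p[:, None]).T @ actions
    return observed_loss * (np.linalg.pinv(P) @ actions[played])

def run_exp_weights(actions, losses, eta, gamma, seed=0):
    """Play one action per round, seeing only the scalar loss of that action."""
    rng = np.random.default_rng(seed)
    cum_loss_est = np.zeros(actions.shape[1])
    total_loss = 0.0
    for loss_vec in losses:
        p = sampling_distribution(actions, cum_loss_est, eta, gamma)
        i = rng.choice(len(actions), p=p)
        observed = float(actions[i] @ loss_vec)  # bandit feedback only
        total_loss += observed
        cum_loss_est += loss_estimate(actions, p, i, observed)
    return total_loss
```

When the actions span the whole space, the matrix P is invertible and the estimate is unbiased: averaging `loss_estimate` over the sampling distribution recovers the true loss vector. That property is what allows the weights to be updated from a single scalar observation per round.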
Added 20 Apr 2012
Updated 20 Apr 2012
Type Journal
Year 2012
Where CORR
Authors Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham M. Kakade