Sciweavers

9 search results - page 1 / 2
COLT
2010
Springer
13 years 2 months ago
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...
Alekh Agarwal, Ofer Dekel, Lin Xiao
NIPS
2004
13 years 6 months ago
Nearly Tight Bounds for the Continuum-Armed Bandit Problem
In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. Wh...
Robert D. Kleinberg
CORR
2012
Springer
12 years 9 days ago
Towards minimax policies for online linear optimization with bandit feedback
We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of...
Sébastien Bubeck, Nicolò Cesa-Bianch...
CORR
2004
Springer
13 years 4 months ago
Online convex optimization in the bandit setting: gradient descent without a gradient
We study a general online convex optimization problem. We have a convex set S and an unknown sequence of cost functions c_1, c_2, ..., and in each period, we choose a feasible po...
Abraham Flaxman, Adam Tauman Kalai, H. Brendan McM...
ICML
2009
IEEE
14 years 5 months ago
Interactively optimizing information retrieval systems as a dueling bandits problem
We present an on-line learning framework tailored towards real-time learning from observed user behavior in search engines and other information retrieval systems. In particular, ...
Yisong Yue, Thorsten Joachims