Sciweavers

97 search results - page 2 / 20
» Logarithmic Regret Algorithms for Online Convex Optimization
COLT
2010
Springer
13 years 3 months ago
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...
Alekh Agarwal, Ofer Dekel, Lin Xiao
CORR
2006
Springer
13 years 5 months ago
Approximate Convex Optimization by Online Game Playing
This paper describes a general framework for converting online game playing algorithms into constrained convex optimization algorithms. This framework allows us to convert the wel...
Elad Hazan
CORR
2011
Springer
12 years 12 months ago
Online Learning of Rested and Restless Bandits
In this paper, we study the online learning problem involving rested and restless multi-armed bandits with multiple plays. The system consists of a single player/user and a set of K...
Cem Tekin, Mingyan Liu
ALT
2008
Springer
14 years 1 months ago
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Ronald Ortner
CORR
2010
Springer
13 years 5 months ago
Adaptive Bound Optimization for Online Convex Optimization
We introduce a new online convex optimization algorithm that adaptively chooses its regularization function based on the loss functions observed so far. This is in contrast to pre...
H. Brendan McMahan, Matthew J. Streeter