Sciweavers

» Algorithm Selection as a Bandit Problem with Unbounded Losse...
ALT
2009
Springer
13 years 9 months ago
The Follow Perturbed Leader Algorithm Protected from Unbounded One-Step Losses
In this paper the sequential prediction problem with expert advice is considered for the case when the losses of experts suffered at each step can be unbounded. We present some mo...
Vladimir V. V'yugin
JMLR
2011
13 years 5 hour ago
Online Learning in Case of Unbounded Losses Using Follow the Perturbed Leader Algorithm
In this paper the sequential prediction problem with expert advice is considered for the case where losses of experts suffered at each step cannot be bounded in advance. We presen...
Vladimir V. V'yugin
CP
2006
Springer
13 years 8 months ago
A Simple Distribution-Free Approach to the Max k-Armed Bandit Problem
The max k-armed bandit problem is a recently-introduced online optimization problem with practical applications to heuristic search. Given a set of k slot machines, each yielding p...
Matthew J. Streeter, Stephen F. Smith
CORR
2010
Springer
13 years 5 months ago
Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards
We consider the classical multi-armed bandit problem with Markovian rewards. When played, an arm changes its state in a Markovian fashion, while it remains frozen when not played. Th...
Cem Tekin, Mingyan Liu
COLT
2010
Springer
13 years 3 months ago
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...
Alekh Agarwal, Ofer Dekel, Lin Xiao