Sciweavers

39 search results - page 5 / 8
COLT
2010
Springer
13 years 4 months ago
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...
Alekh Agarwal, Ofer Dekel, Lin Xiao
COLT
2010
Springer
13 years 4 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
COLT
2010
Springer
13 years 4 months ago
Convex Games in Banach Spaces
We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...
Karthik Sridharan, Ambuj Tewari
COLT
2010
Springer
13 years 4 months ago
Strongly Non-U-Shaped Learning Results by General Techniques
In learning, a semantic or behavioral U-shape occurs when a learner first learns, then unlearns, and, finally, relearns, some target concept (on the way to success). Within the framew...
John Case, Timo Kötzing
COLT
2010
Springer
13 years 4 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos