Sciweavers

39 search results - page 5 / 8
» colt 2010
COLT
2010
Springer
14 years 10 months ago
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...
Alekh Agarwal, Ofer Dekel, Lin Xiao
COLT
2010
Springer
14 years 10 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
COLT
2010
Springer
14 years 10 months ago
Convex Games in Banach Spaces
We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...
Karthik Sridharan, Ambuj Tewari
COLT
2010
Springer
14 years 10 months ago
Strongly Non-U-Shaped Learning Results by General Techniques
In learning, a semantic or behavioral U-shape occurs when a learner first learns, then unlearns, and, finally, relearns, some target concept (on the way to success). Within the framew...
John Case, Timo Kötzing
COLT
2010
Springer
14 years 10 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos