Sciweavers

Share
warning: Creating default object from empty value in /var/www/modules/taxonomy/taxonomy.module on line 1416.
COLT
2010
Springer
10 years 2 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
books