Sciweavers

222

COLT
2010
Springer

207views Machine Learning» more COLT 2010»

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

15 years 5 months ago

Download www.colt2010.org

Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers