Sciweavers

CORR
2010
Springer
Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards
In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...
Yi Gai, Bhaskar Krishnamachari, Rahul Jain