Sciweavers

Search: "On the myopic policy for a class of restless bandit problems..." (7 results, page 2 of 2)
CORR 2010, Springer
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...
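The setting this abstract describes (N restless arms whose states evolve as Markov chains, with only the played arm observed) can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes two-state ON/OFF arms with made-up transition probabilities p01 and p11 that are known to the player (the classic known-parameter setting the abstract mentions, not the non-Bayesian case the paper studies), and it runs the myopic policy of always playing the arm with the highest current belief of being ON.

```python
# Illustrative sketch only: restless ON/OFF arms with assumed, known
# transition probabilities, played by a myopic (greedy-belief) policy.
import random

N = 4                 # number of arms (assumed for the example)
p01, p11 = 0.2, 0.8   # P(OFF->ON), P(ON->ON), assumed identical across arms
T = 10_000            # horizon

states = [random.random() < 0.5 for _ in range(N)]   # hidden arm states
beliefs = [0.5] * N                                  # P(arm is ON) estimates
total_reward = 0

for t in range(T):
    # Myopic policy: play the arm currently believed most likely to be ON.
    arm = max(range(N), key=lambda i: beliefs[i])
    reward = 1 if states[arm] else 0
    total_reward += reward

    # Belief for the next slot: the played arm was observed exactly;
    # unobserved arms are propagated one step through the Markov chain.
    for i in range(N):
        if i == arm:
            beliefs[i] = p11 if states[i] else p01
        else:
            beliefs[i] = beliefs[i] * p11 + (1 - beliefs[i]) * p01

    # All arms evolve whether played or not (the "restless" property).
    states = [random.random() < (p11 if s else p01) for s in states]

print(f"average reward of myopic policy: {total_reward / T:.3f}")
```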
WIOPT 2011, IEEE
Network utility maximization over partially observable Markovian channels
This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known...
Chih-Ping Li, Michael J. Neely
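The partially observable ON/OFF channel model in this abstract rests on tracking, for each user, a belief: the posterior probability that the channel is currently ON. The sketch below is an illustration only, not the authors' algorithm: it assumes known transition probabilities p01 and p11, ACK/NACK feedback only for the user that is actually served, and a simple serve-the-highest-belief rule, all introduced for the example.

```python
# Illustrative sketch only: belief tracking for a partially observable
# Markov ON/OFF channel under assumed, known transition probabilities.

def propagate(belief: float, p01: float, p11: float) -> float:
    """One-step prediction of the ON probability through the Markov chain."""
    return belief * p11 + (1 - belief) * p01


def update(belief: float, served: bool, success: bool,
           p01: float, p11: float) -> float:
    """Advance the belief by one slot.

    If the user was served, ACK/NACK feedback reveals that slot's state
    (ON on success, OFF on failure); otherwise only the prediction runs.
    """
    if served:
        belief = 1.0 if success else 0.0
    return propagate(belief, p01, p11)


# Made-up parameters and one example slot: user 0 is served and the
# transmission succeeds, user 1 is not served.
p01, p11 = 0.3, 0.9
beliefs = [0.5, 0.5]
beliefs[0] = update(beliefs[0], served=True, success=True, p01=p01, p11=p11)
beliefs[1] = update(beliefs[1], served=False, success=False, p01=p01, p11=p11)
print(beliefs)   # [0.9, 0.6]: user 0 now looks more likely to be ON
```

A scheduler built on this idea would serve the user with the highest belief (or a utility-weighted version of it) each slot and feed the observed ACK/NACK back through update(); that scheduling rule is an assumption here, not the method of the paper.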