Sciweavers

7 search results - page 2 / 2
» Online exploration in least-squares policy iteration
MOBIHOC
2007
ACM
14 years 5 months ago
Distributed opportunistic scheduling for ad-hoc communications: an optimal stopping approach
We consider distributed opportunistic scheduling (DOS) in wireless ad-hoc networks, where many links contend for the same channel using random access. In such networks, distribute...
Dong Zheng, Weiyan Ge, Junshan Zhang
COLT
2008
Springer
13 years 7 months ago
Adapting to a Changing Environment: the Brownian Restless Bandits
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Aleksandrs Slivkins, Eli Upfal