Sciweavers

7 search results - page 2 / 2
» Online exploration in least-squares policy iteration
MOBIHOC
2007
ACM
15 years 9 months ago
Distributed opportunistic scheduling for ad-hoc communications: an optimal stopping approach
We consider distributed opportunistic scheduling (DOS) in wireless ad-hoc networks, where many links contend for the same channel using random access. In such networks, distribute...
Dong Zheng, Weiyan Ge, Junshan Zhang
COLT
2008
Springer
14 years 11 months ago
Adapting to a Changing Environment: the Brownian Restless Bandits
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Aleksandrs Slivkins, Eli Upfal