Sciweavers

200 search results for "Point-Based Policy Iteration" (page 38 of 40)
MOBIHOC
2007
ACM
15 years 9 months ago
Distributed opportunistic scheduling for ad-hoc communications: an optimal stopping approach
We consider distributed opportunistic scheduling (DOS) in wireless ad-hoc networks, where many links contend for the same channel using random access. In such networks, distribute...
Dong Zheng, Weiyan Ge, Junshan Zhang
COLT
2008
Springer
14 years 11 months ago
Adapting to a Changing Environment: the Brownian Restless Bandits
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Aleksandrs Slivkins, Eli Upfal
NIPS
1996
14 years 10 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies
WWW
2010
ACM
15 years 4 months ago
Privacy wizards for social networking sites
Privacy is an enormous problem in online social networking sites. While sites such as Facebook allow users fine-grained control over who can see their profiles, it is difficult ...
Lujun Fang, Kristen LeFevre
ICRA
2008
IEEE
15 years 3 months ago
An approximate algorithm for solving oracular POMDPs
We propose a new approximate algorithm, LAJIV (Lookahead J-MDP Information Value), to solve Oracular Partially Observable Markov Decision Problems (OPOMDPs), a special ...
Nicholas Armstrong-Crews, Manuela M. Veloso