Ranking distributed probabilistic data

Ranking queries are essential tools for processing large amounts of probabilistic data, which encode exponentially many possible deterministic instances. In many applications where uncertainty and fuzzy information arise, data are collected from multiple sources in distributed, networked locations, e.g., distributed sensor fields with imprecise measurements or multiple scientific institutes with inconsistencies in their scientific data. Because of the network delay and the economic cost of communicating large amounts of data over a network, a fundamental problem in these scenarios is to retrieve the global top-k tuples from all distributed sites with minimum communication cost. Using the well-founded notion of the expected rank of each tuple across all possible worlds as the basis of ranking, this work designs both communication- and computation-efficient algorithms for retrieving the top-k tuples with the smallest ranks from distributed sites. Extensive experiments using both synthetic...
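The notion of expected rank mentioned in the abstract can be illustrated with a small brute-force sketch. This is not the paper's distributed algorithm and it ignores communication cost entirely. It assumes a simplified model in which each tuple independently takes one of a few scores with given probabilities, and it assumes a tuple's rank in a possible world is the number of tuples with a strictly higher score. The names `expected_ranks` and `top_k` and the sample data are hypothetical.

```python
from itertools import product


def expected_ranks(tuples):
    """Brute-force expected rank over all possible worlds.

    tuples: dict mapping a tuple name to a list of (score, probability)
    pairs whose probabilities sum to 1 (assumed model, independent tuples).
    """
    names = list(tuples)
    ranks = {name: 0.0 for name in names}
    # Each possible world fixes one (score, probability) choice per tuple.
    for choice in product(*(tuples[name] for name in names)):
        world_prob = 1.0
        for _, prob in choice:
            world_prob *= prob
        scores = {name: score for name, (score, _) in zip(names, choice)}
        for name in names:
            # Assumed rank: number of tuples scoring strictly higher.
            rank = sum(1 for other in names if scores[other] > scores[name])
            ranks[name] += world_prob * rank
    return ranks


def top_k(tuples, k):
    """Return the k tuple names with the smallest expected ranks."""
    ranks = expected_ranks(tuples)
    return sorted(ranks, key=lambda name: ranks[name])[:k]


# Hypothetical sample data: three uncertain tuples.
data = {
    "t1": [(100, 0.5), (70, 0.5)],
    "t2": [(92, 1.0)],
    "t3": [(85, 0.5), (60, 0.5)],
}
print(expected_ranks(data))
print(top_k(data, 2))
```

The enumeration is exponential in the number of tuples, so it only serves to make the definition concrete on toy inputs.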
Feifei Li, Ke Yi, Jeffrey Jestes
Added 05 Dec 2009
Updated 05 Dec 2009
Type Conference
Year 2009
Authors Feifei Li, Ke Yi, Jeffrey Jestes