Approximate distributed top-k queries

We consider a distributed system where each node keeps a local count for items (similar to elections where nodes are ballot boxes and items are candidates). A top-k query in such a system asks which are the k items whose global count, across all nodes in the system, is the largest. In this paper we present a Monte Carlo algorithm that outputs, with high probability, a set of k candidates which approximates the top-k items. The algorithm is motivated by sensor networks in that it focuses on reducing the individual communication complexity. In contrast to previous algorithms, the communication complexity depends only on the global scores and not on the partition of scores among nodes. If the number of nodes is large, our algorithm dramatically reduces the communication complexity when compared with deterministic algorithms. We show that the complexity of our algorithm is close to a lower bound on the cell-probe complexity of any non-interactive top-k approximation algorithm. We show that...
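To make the query concrete, here is a minimal Python sketch of the setting the abstract describes. It is not the authors' algorithm: `exact_top_k` is the naive baseline in which every node ships all of its counts, and `sampled_top_k` is a generic sampling approach, swapped in purely for illustration, in which each node reports each unit of count independently with probability `p`. The function names, the parameter `p`, and the example data are all hypothetical.

```python
import random
from collections import Counter


def rank(counts, k):
    # Sort by descending count; break ties by item name for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered[:k]]


def exact_top_k(nodes, k):
    # Naive baseline: every node sends all of its local counts.
    total = Counter()
    for local in nodes:
        total.update(local)
    return rank(total, k)


def sampled_top_k(nodes, k, p, rng):
    # Illustrative Monte Carlo variant (an assumption, not the paper's
    # method): each unit of local count is reported with probability p,
    # so the expected traffic shrinks by a factor of p.
    sampled = Counter()
    for local in nodes:
        for item, count in local.items():
            kept = sum(1 for _ in range(count) if rng.random() < p)
            if kept:
                sampled[item] += kept
    return rank(sampled, k)


# Hypothetical data: three "ballot boxes" and three "candidates".
nodes = [
    {"alice": 40, "bob": 10, "carol": 5},
    {"alice": 5, "bob": 30, "carol": 20},
    {"alice": 10, "bob": 20, "carol": 40},
]

print(exact_top_k(nodes, 2))
print(sampled_top_k(nodes, 2, 0.2, random.Random(0)))
```

In this data, "alice" leads at the first node but has the smallest global count, which is why the query has to be answered from global scores rather than from any single node's view. With `p = 1.0` the sampled variant reports every unit and so matches the exact answer; smaller values of `p` trade accuracy for less communication.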
Boaz Patt-Shamir, Allon Shafrir
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2008
Where DC