
EDBT
2011
ACM

RanKloud: a scalable ranked query processing framework on hadoop

The popularity of batch-oriented cluster architectures like Hadoop is on the rise. These batch-based systems successfully achieve high degrees of scalability by carefully allocating resources and leveraging opportunities to parallelize basic processing tasks. However, they are known to fall short in certain application domains such as large scale media analysis. In these applications, the utility of a given data element plays a vital role in a particular analysis task, and this utility most often depends on the way the data is collected or interpreted. However, existing batch data processing frameworks do not consider data utility in allocating resources, and hence fail to optimize for ranked/top-k query processing in which the user is interested in obtaining a relatively small subset of the best result instances. A naïve implementation of these operations on an existing system would need to enumerate more candidates than needed, before it can filter out the k best results. We not...
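To make the cost of the naïve approach concrete, the following minimal single-machine sketch contrasts it with a bounded selection. It is not RanKloud's method, which the truncated abstract does not describe; the function names, the `(item, score)` input format, and the use of a min-heap are all assumptions made for illustration.

```python
import heapq


def top_k_naive(scored_items, k):
    """Naive approach: materialize and sort every candidate, then keep k."""
    # sorted() is stable, so equal scores keep their arrival order.
    return sorted(scored_items, key=lambda pair: pair[1], reverse=True)[:k]


def top_k_bounded(scored_items, k):
    """Hypothetical alternative: stream candidates through a k-entry min-heap."""
    if k <= 0:
        return []
    heap = []  # entries are (score, -arrival_index, item); weakest entry on top
    for index, (item, score) in enumerate(scored_items):
        entry = (score, -index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif score > heap[0][0]:
            # The new candidate beats the weakest kept result, so swap it in.
            heapq.heapreplace(heap, entry)
    # Best score first; ties are ordered by arrival, matching top_k_naive.
    ordered = sorted(heap, key=lambda e: (-e[0], -e[1]))
    return [(item, score) for score, _, item in ordered]
```

Both functions return the same k results, but the bounded version never holds more than k candidates in memory, whereas the naïve version sorts the full candidate set before discarding most of it.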
Added 27 Aug 2011
Updated 27 Aug 2011
Type Journal
Year 2011
Where EDBT
Authors K. Selçuk Candan, Parth Nagarkar, Mithila Nagendra, Renwei Yu