Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...
Medusa [3, 6] is a distributed stream processing system based on the Aurora single-site stream processing engine [1]. We demonstrate how Medusa handles time-varying load spikes an...
Magdalena Balazinska, Hari Balakrishnan, Michael S...
We present a low-cost, decentralized algorithm for ID management in distributed hash tables (DHTs) managed by a dynamic set of hosts. Each host is assigned an ID in the unit inter...
One of the central problems for data quality is inconsistency detection. Given a database D and a set Σ of dependencies as data quality rules, we want to identify tuples in D ...
Massive data analysis on large clusters presents new opportunities and challenges for query optimization. Data partitioning is crucial to performance in this environment. Howev...