We collected file system content data from 857 desktop computers at Microsoft over a span of 4 weeks. We analyzed the data to determine the relative efficacy of data deduplication...
Several pieces of information can be used to improve the efficiency of similarity searches on high-dimensional data. The most commonly used information...
One of the world’s largest scientific data systems, NASA’s Earth Observing System Data and Information System (EOSDIS), has stored over three petabytes of earth science data in...
Jeanne Behnke, Tonjua Hines Watts, Ben Kobler, Daw...
The cloud is poised to become the next computing environment for both data storage and computation due to its pay-as-you-go and provision-as-you-go models. Cloud storage is alread...
Kiran-Kumar Muniswamy-Reddy, Peter Macko, Margo I....
We pose the question: how do we efficiently evaluate a join operator, distributed over a heterogeneous network? Our objective here is to optimize the delay of output tuples. We di...