Failures of any type are common in current datacenters, partly due to the higher scales of the data stored. As data scales up, its availability becomes more complex, while differe...
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aber...
We present a parallel data processor centered around a programming model of so called Parallelization Contracts (PACTs) and the scalable parallel execution engine Nephele [18]. Th...
Many modern enterprises are collecting data at the most detailed level possible, creating data repositories ranging from terabytes to petabytes in size. The ability to apply sophi...
Sudipto Das, Yannis Sismanis, Kevin S. Beyer, Rain...
This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one f...
Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambr...
Bulk loading refers to the process of creating an index from scratch for a given data set. This problem is well understood for B-trees, but so far, non-traditional index structure...