Timely and cost-effective processing of large datasets has become a critical ingredient for the success of many academic, government, and industrial organizations. The combination...
Abstract—The distributed nature and large scale of MapReduce programs and systems poses two challenges in using existing profiling and debugging tools to understand MapReduce pr...
Widespread adaptation of shared memory programming for High Performance Computing has been inhibited by a lack of standardization and the resulting portability problems between pl...
The problem of counting specified combinations of a given set of variables arises in many statistical and data mining applications. To solve this problem, we introduce the PDtree...
Chad Scherrer, Nathaniel Beagley, Jarek Nieplocha,...
: Computational grids that couple geographically distributed resources are becoming the de-facto computing platform for solving large-scale problems in science, engineering, and co...