Abstract— We developed an automated environment to measure the memory access behavior of applications on high performance clusters. Code optimization for processor caches is cruc...
The paper presents a new data partitioning algorithm for parallel computing on heterogeneous processors. Like traditional functional partitioning algorithms, the algorithm assumes ...
There is a rapidly growing set of applications, referred to as data driven applications, in which analysis of large amounts of data drives the next steps taken by the scientist, e...
— Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Due to the complexity associated with developing parallel applications, scientists and engineers rely on highlevel software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...