As shared-memory multiprocessors become the dominant commodity source of computation, parallelizing compilers must support mainstream computations that manipulate irregular, point...
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
Many parallel scientific applications need high-performance I/O. Unfortunately, end-to-end parallel-I/O performance has not been able to keep up with substantial improvements in p...
We investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta (DIIRK) method, an iteration method based on a predictor-corrector scheme. This method ...
External sorting--the process of sorting a file that is too large to fit into the computer's internal memory and must be stored externally on disks--is a fundamental subroutin...