We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. We consider supern...
Three parallel algorithms for solving the 3D problem with a nonlocal boundary condition are considered. The forward and backward Euler finite-difference schemes and the LOD scheme are t...
We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and...
Michael Bauer, John Clark, Eric Schkufza, Alex Aik...
Three pointer-based parallel join algorithms are presented and analyzed for environments in which secondary storage is made transparent to the programmer through memory mapping. B...
Peter A. Buhr, Anil K. Goel, Naomi Nishimura, Prab...
We describe parallel methods for solving large-scale, high-dimensional, sparse least-squares problems that arise in machine learning applications such as document classificatio...