We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. We consider supern...
Three parallel algorithms for solving the 3D problem with a nonlocal boundary condition are considered. The forward and backward Euler finite-difference schemes and the LOD scheme are t...
Three pointer-based parallel join algorithms are presented and analyzed for environments in which secondary storage is made transparent to the programmer through memory mapping. B...
Peter A. Buhr, Anil K. Goel, Naomi Nishimura, Prab...
We describe parallel methods for solving large-scale, high-dimensional, sparse least-squares problems that arise in machine learning applications such as document classificatio...
With rapid advances in VLSI technology, Field Programmable Gate Arrays (FPGAs) are receiving the attention of the Parallel and High Performance Computing community. In this paper,...
Uday Bondhugula, Ananth Devulapalli, Joseph Fernan...