This paper presents a new approach to hardware division—the parallel paths algorithm. In this approach, prescaling allows the division recurrence to be implemented by three proc...
Abstract. Profiling is often the method of choice for performance analysis of parallel applications due to its low overhead and easily comprehensible results. However, a disadvanta...
The linear systems associated with large, sparse, symmetric, positive definite matrices are often solved iteratively using the preconditioned conjugate gradient method. We have d...
The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, QR factorizations to t...
We introduce Scioto, Shared Collections of Task Objects, a lightweight framework for providing task management on distributed memory machines under one-sided and globalview parall...
James Dinan, Sriram Krishnamoorthy, D. Brian Larki...