—Recently, high-end reconfigurable computing systems that employ Field-Programmable Gate Arrays (FPGAs) as hardware accelerators for general-purpose processors have been built. T...
The scalable parallel implementation, targeting SMP and/or multicore architectures, of dense linear algebra libraries is analyzed. Using the LU factorization as a case study, it is...
This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a tas...
Abstract. We investigate the performance of two approaches for matrix inversion based on Gaussian (LU factorization) and Gauss-Jordan eliminations. The target architecture is a cur...
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O...
Abstract: If the computational demands of an interactive graphics rendering application cannot be met by a single commodity Graphics Processing Unit (GPU), multiple graphics accele...
Ross Brennan, Michael Manzke, Keith O'Conor, John ...