Abstract. We investigate the performance of two approaches for matrix inversion, based on Gaussian elimination (LU factorization) and Gauss-Jordan elimination. The target architecture is a cur...
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O...
Large and complex systems of ordinary differential equations (ODEs) arise in diverse areas of science and engineering, and pose special challenges on a streaming processor owing to...
Fred V. Lionetti, Andrew D. McCulloch, Scott B. Ba...
Applications have increasing computational demands, and a wide selection of parallel processor systems is now available. In this paper we fo...
Frederico Pratas, Pedro Trancoso, Alexandros Stama...
It is well known that LDPC decoding is computationally demanding and one of the hardest signal operations to parallelize. Beyond data dependencies that restrict the decoding of a ...
Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...