Sciweavers

40 search results - page 7 / 8
» Parallel Dense Gauss-Seidel Algorithm on Many-Core Processor...
Sort
View
TPDS
2010
174views more  TPDS 2010»
13 years 3 months ago
Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures
The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, QR factorizations to t...
Hatem Ltaief, Jakub Kurzak, Jack Dongarra
PPOPP
2010
ACM
14 years 2 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
EH
2004
IEEE
163views Hardware» more  EH 2004»
13 years 9 months ago
Towards Evolvable Analog Artificial Neural Networks Controllers
This work deals with the design of analog circuits for Artificial Neural Networks (ANNs) controllers using an Evolvable Hardware (EHW) platform. ANNs are massively parallel system...
José Franco Machado do Amaral, Jorge Lu&iac...
ASPLOS
2009
ACM
14 years 6 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
IEEEPACT
1999
IEEE
13 years 9 months ago
Memory System Support for Image Processing
Image processing applications tend to access their data non-sequentially and reuse that data infrequently. As a result, they tend to perform poorly on conventional memory systems ...
Lixin Zhang, John B. Carter, Wilson C. Hsieh, Sall...