Sciweavers

109 search results - page 18 / 22
» Performance of CAP-Specified Linear Algebra Algorithms
PPOPP
2010
ACM
15 years 6 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
CORR
2010
Springer
14 years 9 months ago
Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures
The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile a...
Emmanuel Agullo, Henricus Bouwmeester, Jack Dongar...
CONCURRENCY
2007
14 years 9 months ago
A distributed packed storage for large dense parallel in-core calculations
We propose in this paper a distributed packed storage format that exploits the symmetry or the triangular structure of a dense matrix. This format stores only half of the matrix w...
Marc Baboulin, Luc Giraud, Serge Gratton, Julien L...
WMPI
2004
ACM
15 years 2 months ago
The Opie compiler from row-major source to Morton-ordered matrices
The Opie Project aims to develop a compiler to transform C codes written for row-major matrix representation into equivalent codes for Morton-order matrix representation, and to a...
Steven T. Gabriel, David S. Wise
IMC
2003
ACM
15 years 2 months ago
Tomography-based overlay network monitoring
Overlay network monitoring enables distributed Internet applications to detect and recover from path outages and periods of degraded performance within seconds. For an overlay net...
Yan Chen, David Bindel, Randy H. Katz