Sciweavers

27 search results - page 5 / 6
» Parallel Cholesky Factorization of a Block Tridiagonal Matri...
EUROPAR
2005
Springer
15 years 3 months ago
Automatic Tuning of PDGEMM Towards Optimal Performance
Sophisticated parallel matrix multiplication algorithms like PDGEMM exhibit a complex structure and can be controlled by a large set of parameters including blocking factors and bl...
Sascha Hunold, Thomas Rauber
HPCC
2007
Springer
15 years 3 months ago
A Block JRS Algorithm for Highly Parallel Computation of SVDs
This paper presents a new algorithm for computing the singular value decomposition (SVD) on multilevel memory hierarchy architectures. This algorithm is based on one-sided JRS iter...
Mostafa I. Soliman, Sanguthevar Rajasekaran, Reda ...
ICPPW
2002
IEEE
15 years 2 months ago
A Programming Methodology for Designing Block Recursive Algorithms on Various Computer Networks
In this paper, we use the tensor product notation as the framework of a programming methodology for designing block recursive algorithms on various computer networks. In our previ...
Min-Hsuan Fan, Chua-Huang Huang, Yeh-Ching Chung
PPOPP
2010
ACM
15 years 6 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
ICPP
2002
IEEE
15 years 2 months ago
Analysis of Memory Hierarchy Performance of Block Data Layout
Recently, several experimental studies have been conducted on block data layout as a data transformation technique used in conjunction with tiling to improve cache performance. In...
Neungsoo Park, Bo Hong, Viktor K. Prasanna