Sciweavers

23 search results - page 2 / 5
» A Proposal for a Set of Parallel Basic Linear Algebra Subpro...
IPPS
2008
IEEE
13 years 12 months ago
Build to order linear algebra kernels
The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...
Jeremy G. Siek, Ian Karlin, Elizabeth R. Jessup
ARCS
2008
Springer
13 years 7 months ago
An Optimized ZGEMM Implementation for the Cell BE
The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
PPSC
1989
13 years 6 months ago
Evaluating Block Algorithm Variants in LAPACK
The LAPACK software project currently under development is intended to provide a portable linear algebra library for high performance computers. LAPACK will make use of the Level 1...
Ed Anderson, Jack Dongarra
ISPDC
2007
IEEE
13 years 11 months ago
Hybrid MPI-Thread Parallelization of the Fast Multipole Method
We present in this paper multi-thread and multi-process parallelizations of the Fast Multipole Method (FMM) for the Laplace equation, for uniform and non-uniform distributions. These ...
Olivier Coulaud, Pierre Fortin, Jean Roman
ERSA
2007
13 years 6 months ago
High-Precision BLAS on FPGA-enhanced Computers
The emergence of high-density reconfigurable hardware devices gives scientists and engineers an option to accelerate their numerical computing applications on low-cost but power...
Chuan He, Guan Qin, Richard E. Ewing, Wei Zhao