Sciweavers

10 search results - page 2 / 2
» Dense linear algebra solvers for multicore with GPU accelera...
Sort
View
CORR
2008
Springer
162views Education» more  CORR 2008»
13 years 5 months ago
Accelerating Scientific Computations with Mixed Precision Algorithms
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit ...
Marc Baboulin, Alfredo Buttari, Jack Dongarra, Jak...
PDP
2010
IEEE
14 years 5 days ago
Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of
—Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore ...
Pedro Alonso, Ravi Reddy, Alexey L. Lastovetsky
EUROPAR
2009
Springer
13 years 12 months ago
An Extension of the StarSs Programming Model for Platforms with Multiple GPUs
While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/p...
Eduard Ayguadé, Rosa M. Badia, Francisco D....
ASPLOS
2009
ACM
14 years 6 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
PPOPP
2010
ACM
14 years 2 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley