Sciweavers

436 search results - page 71 / 88
» Parallel Program Development and Execution in the Grid
Sort
View
IPPS
1996
IEEE
15 years 1 months ago
ECO: Efficient Collective Operations for Communication on Heterogeneous Networks
PVM and other distributed computing systems have enabled the use of networks of workstations for parallel computation, but their approach of treating all networks as collections o...
Bruce Lowekamp, Adam Beguelin
CGO
2010
IEEE
15 years 4 months ago
Parameterized tiling revisited
Tiling, a key transformation for optimizing programs, has been widely studied in literature. Parameterized tiled code is important for auto-tuning systems since they often execute...
Muthu Manikandan Baskaran, Albert Hartono, Sanket ...
FPL
2006
Springer
242views Hardware» more  FPL 2006»
15 years 1 months ago
TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs
With current FPGAs, designers can now instantiate several embedded processors, memory units, and a wide variety of IP blocks to build a single-chip, high-performance multiprocesso...
Manuel Saldaña, Paul Chow
ASPLOS
2009
ACM
15 years 10 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
CLUSTER
2003
IEEE
15 years 2 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
— Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...