Sciweavers

IPPS
1996
IEEE
15 years 1 month ago
An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes
A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning st...
Hong Q. Ding, Robert D. Ferraro
IPPS
2003
IEEE
15 years 3 months ago
Optimizing Synchronization Operations for Remote Memory Communication Systems
Synchronization operations, such as fence and locking, are used in many parallel operations accessing shared memory. However, a process which is blocked waiting for a fence operat...
Darius Buntinas, Amina Saify, Dhabaleswar K. Panda...
CLUSTER
2007
IEEE
15 years 4 months ago
The design of MPI based distributed shared memory systems to support OpenMP on clusters
OpenMP can be supported in cluster environments by using distributed shared memory (DSM) systems. A portable approach for building DSM systems is to layer them on MPI. With these...
H'sien J. Wong, Alistair P. Rendell
ICCS
2009
Springer
15 years 4 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise; therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
IPPS
2007
IEEE
15 years 4 months ago
Domain Decomposition vs. Master-Slave in Apparently Homogeneous Systems
This paper investigates the utilization of the master-slave (MS) paradigm as an alternative to domain decomposition (DD) methods for parallelizing lattice gauge theory (LGT) model...
Cyril Banino-Rokkones