Sciweavers

PC
2007
Exploring weak scalability for FEM calculations on a GPU-enhanced cluster
Dominik Göddeke, Robert Strzodka, Jamaludin M...
PC
2007
Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communic
This paper presents the implementation of MPICH2 over the Nemesis communication subsystem and the evaluation of its shared-memory performance. We describe design issues as well as...
Darius Buntinas, Guillaume Mercier, William Gropp
PC
2007
Optimizing a conjugate gradient solver with non-blocking collective operations
This paper presents a case study about the applicability and usage of non-blocking collective operations. These operations provide the ability to overlap communication with computa...
Torsten Hoefler, Peter Gottschling, Andrew Lumsdai...
PC
2007
High-performance computing using accelerators
Wu-chun Feng, Dinesh Manocha
PC
2007
MPI collective algorithm selection and quadtree encoding
In this paper, we focus on the MPI collective algorithm selection process and explore the applicability of the quadtree encoding method to this problem. During the algorithm ...
Jelena Pjesivac-Grbovic, George Bosilca, Graham E....
PC
2007
High performance combinatorial algorithm design on the Cell Broadband Engine processor
The Sony–Toshiba–IBM Cell Broadband Engine (Cell/B.E.) is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD co-process...
David A. Bader, Virat Agarwal, Kamesh Madduri, Seu...
PC
2007
Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems
We explore runtime mechanisms and policies for scheduling dynamic multi-grain parallelism on heterogeneous multi-core processors. Heterogeneous multi-core processors integrate con...
Filip Blagojevic, Dimitrios S. Nikolopoulos, Alexa...
PC
2007
Thread-safety in an MPI implementation: Requirements and analysis
The MPI-2 Standard has carefully specified the interaction between MPI and user-created threads. The goal of this specification is to allow users to write multithreaded MPI progr...
William Gropp, Rajeev Thakur
PC
2007
Data distribution for dense factorization on computers with memory heterogeneity
In this paper, we study the problem of optimal matrix partitioning for parallel dense factorization on heterogeneous processors. First, we outline existing algorithms solving the ...
Alexey L. Lastovetsky, Ravi Reddy