Sciweavers

922 search results - page 115 / 185
» The design of a task parallel library
Sort
View
HPDC
2010
IEEE
15 years 2 months ago
Scalability of communicators and groups in MPI
As the number of cores inside compute clusters continues to grow, the scalability of MPI (Message Passing Interface) is important to ensure that programs can continue to execute o...
Humaira Kamal, Seyed M. Mirtaheri, Alan Wagner
EUROPAR
2009
Springer
15 years 5 months ago
Fast and Efficient Synchronization and Communication Collective Primitives for Dual Cell-Based Blades
The Cell Broadband Engine (Cell BE) is a heterogeneous multi-core processor specifically designed to exploit thread-level parallelism. Its memory model comprehends a common shared ...
Epifanio Gaona, Juan Fernández, Manuel E. A...
PVM
2010
Springer
14 years 12 months ago
Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems
With the ever-increasing numbers of cores per node on HPC systems, applications are increasingly using threads to exploit the shared memory within a node, combined with MPI across ...
Gábor Dózsa, Sameer Kumar, Pavan Bal...
HPDC
2008
IEEE
15 years 8 months ago
StoreGPU: exploiting graphics processing units to accelerate distributed storage systems
Today Graphics Processing Units (GPUs) are a largely underexploited resource on existing desktops and a possible costeffective enhancement to high-performance systems. To date, mo...
Samer Al-Kiswany, Abdullah Gharaibeh, Elizeu Santo...
IEEEPACT
2007
IEEE
15 years 7 months ago
Automatic Correction of Loop Transformations
Loop nest optimization is a combinatorial problem. Due to the growing complexity of modern architectures, it involves two increasingly difficult tasks: (1) analyzing the profita...
Nicolas Vasilache, Albert Cohen, Louis-Noël P...