Sciweavers

Search results for: A Scalable Approach to MPI Application Performance Analysis
FGCS
2008
13 years 5 months ago
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Darius Buntinas, Camille Coti, Thomas Hérau...
EUROPAR
2009
Springer
14 years 9 days ago
Process Mapping for MPI Collective Communications
It is an important problem to map virtual parallel processes to physical processors (or cores) in an optimized way to get scalable performance due to non-uniform communication cost...
Jin Zhang, Jidong Zhai, Wenguang Chen, Weimin Zhen...
PDP
2003
IEEE
13 years 11 months ago
Performance Modeling of Scientific Applications: Scalability Analysis of LAPW0
This paper presents a high-level approach for assessing the performance behavior of complex scientific applications running on a high-performance system through simulation. The pr...
Thomas Fahringer, Nicola Mazzocca, Massimiliano Ra...
HIPC
2007
Springer
13 years 12 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
CCGRID
2006
IEEE
13 years 11 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra