Sciweavers

3 search results - page 1 / 1
HPDC
1999
IEEE
13 years 9 months ago
Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations
This paper reports on the architecture and design of Starfish, an environment for executing dynamic (and static) MPI-2 programs on a cluster of workstations. Starfish is unique in ...
Adnan Agbaria, Roy Friedman
HIPC
2009
Springer
13 years 2 months ago
Fast checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on multicore architecture
Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
Xiangyong Ouyang, Karthik Gopalakrishnan, Tejus Ga...
IPPS
2000
IEEE
13 years 9 months ago
DyRecT: Software Support for Adaptive Parallelism on NOWs
In this paper, we describe DyRecT (Dynamic Reconfiguration Toolkit), a software library that allows programmers to develop adaptively parallel message-passing MPI program...
Etienne Godard, Sanjeev Setia, Elizabeth L. White