Sciweavers

300 search results - page 12 / 60
» A System Recovery Benchmark for Clusters
PVM
2005
Springer
15 years 6 months ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
Fault Tolerant MPI (FT-MPI) [6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
IPPS
2003
IEEE
15 years 5 months ago
Recovery Schemes for High Availability and High Performance Distributed Real-Time Computing
Clusters and distributed systems offer fault tolerance and high performance through load sharing, and are thus attractive in real-time applications. When all computers are up and ...
Lars Lundberg, Daniel Häggander, Kamilla Klon...
CLUSTER
2003
IEEE
15 years 5 months ago
Communication Middleware Systems for Heterogenous Clusters: A Comparative Study
This paper presents a comparative study of the communication middleware systems suitable for aggregating computational clusters with heterogeneous incompatible SANs into a common ...
Daniel Balkanski, Mario Trams, Wolfgang Rehm
ISPDC
2003
IEEE
15 years 5 months ago
Lightweight Logging and Recovery for Distributed Shared Memory over Virtual Interface Architecture
As software Distributed Shared Memory (DSM) systems become attractive on larger clusters, the focus of attention moves toward improving the reliability of systems. In this paper, w...
Soyeon Park, Youngjae Kim, Seung Ryoul Maeng
CSE
2011
IEEE
14 years 7 days ago
Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems
In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, M...
Xingfu Wu, Valerie E. Taylor