Sciweavers

4 search results
IPPS
1998
IEEE
13 years 9 months ago
Migration and Rollback Transparency for Arbitrary Distributed Applications in Workstation Clusters
Programmers and users of compute-intensive scientific applications often do not want to (or even cannot) code load balancing and fault tolerance into their programs. The PBEAM syst...
Stefan Petri, Matthias Bolz, Horst Langendörf...
SRDS
2003
IEEE
13 years 10 months ago
Raptor: Integrating Checkpoints and Thread Migration for Cluster Management
Software distributed shared-memory (SDSM) provides the abstraction necessary to run shared-memory applications on cost-effective parallel platforms such as clusters of workstations. Howeve...
Hazim Shafi, Evan Speight, John K. Bennett
CANPC
2000
Springer
13 years 9 months ago
Transparent Network Connectivity in Dynamic Cluster Environments
Improvements in microprocessor and networking performance have made networks of workstations a very attractive platform for high-end parallel and distributed computing. However, t...
Xiaodong Fu, Hua Wang, Vijay Karamcheti
ICS
2007
Tsinghua U.
13 years 11 months ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...