Sciweavers

ICS
2007
Tsinghua U.
13 years 11 months ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing increasingly relies on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...