Fault tolerance is one of the key issues for large scale applications executed on high performance computing systems. In a cluster federation, clusters are gathered to provide hug...
— Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Storage and retrieval technology for large-scale medical image systems has matured significantly during the past ten years but many implementations still lack cost-effective backu...
H. K. Huang, Aifeng Zhang, Brent J. Liu, Zheng Zho...
Object-based checkpoints are consistent in the object-based system but may be inconsistent according to the traditional message-based definition. We present a protocol for taking ...
Abstract: The use of meta-level architectures for the implementation of faulttolerant systems is today very appealing. Nevertheless, all existing fault-tolerant systems based on th...