Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure during an application's executio...
Large clusters, high-availability clusters, and Grid deployments often suffer from network, node, or operating system faults and thus require the use of fault tolerant programmin...
Fault tolerance is a major concern for critical high-performance applications that use the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
We have created Zap, a novel system for transparent migration of legacy and networked applications. Zap provides a thin virtualization layer on top of the operating system that in...
Steven Osman, Dinesh Subhraveti, Gong Su, Jason Ni...
As users interact with the world and their peers through their computers, it is becoming important to archive and later search the information that they have viewed. We present De...
Oren Laadan, Ricardo A. Baratto, Dan B. Phung, Sha...