Sciweavers

» Implementation of Fault-Tolerant GridRPC Applications
DSN
2003
IEEE
15 years 3 months ago
Engineering Fault-Tolerant TCP/IP Servers Using FT-TCP
In a recent paper [2] we have proposed FT-TCP: an architecture that allows a replicated service to survive crashes without breaking its TCP connections. FT-TCP is attractive in pr...
Dmitrii Zagorodnov, Keith Marzullo, Lorenzo Alvisi...
HPCA
2003
IEEE
15 years 10 months ago
Dynamic Data Replication: An Approach to Providing Fault-Tolerant Shared Memory Clusters
A challenging issue in today's server systems is to transparently deal with failures and application-imposed requirements for continuous operation. In this paper we address t...
Rosalia Christodoulopoulou, Reza Azimi, Angelos Bi...
DEXAW
2004
IEEE
15 years 1 months ago
Using Data-Flow Analysis for Resilience and Result Checking in Peer-To-Peer Computations
To achieve correct execution of peer-to-peer applications on non-reliable resources, we present a portable and distributed algorithm that provides fault tolerance and result checki...
Samir Jafar, Sébastien Varrette, Jean-Louis...
FGCS
2008
14 years 10 months ago
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Darius Buntinas, Camille Coti, Thomas Hérau...
IPPS
2007
IEEE
15 years 4 months ago
A Fault Tolerance Protocol with Fast Fault Recovery
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...
Sayantan Chakravorty, Laxmikant V. Kalé