Sciweavers

1024 search results - page 87 / 205
» Fault Tolerance in Decentralized Systems
Sort
View
IPPS
1998
IEEE
15 years 2 months ago
Fault-Tolerant Message Routing for Multiprocessors
In this paper the problem of fault-tolerant message routing in two-dimensional meshes, with each inner node having 4 neighbors, is investigated. It is assumed that some nodes/links...
Lev Zakrevski, Mark G. Karpovsky
ISCAPDCS
2001
14 years 11 months ago
Tolerating Transient Faults through an Instruction Reissue Mechanism
In this paper, we propose a fault-tolerant mechanism for microprocessors, which detects transient faults and recovers from them. There are two driving force to investigate fault-t...
Toshinori Sato, Itsujiro Arita
SOSP
2009
ACM
15 years 6 months ago
Upright cluster services
The UpRight library seeks to make Byzantine fault tolerance (BFT) a simple and viable alternative to crash fault tolerance for a range of cluster services. We demonstrate UpRight ...
Allen Clement, Manos Kapritsos, Sangmin Lee, Yang ...
ICS
2011
Tsinghua U.
14 years 1 months ago
High performance linpack benchmark: a fault tolerant implementation without checkpointing
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
Teresa Davies, Christer Karlsson, Hui Liu, Chong D...
CLUSTER
2004
IEEE
15 years 1 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...