Sciweavers

68 search results - page 3 / 14
» Failure Detection with Booting in Partially Synchronous Syst...
SSS
2007
Springer
13 years 11 months ago
Global Predicate Detection in Distributed Systems with Small Faults
Abstract. We study the problem of global predicate detection in presence of permanent and transient failures. We term the transient failures as small faults. We show that it is imp...
Felix C. Freiling, Arshad Jhumka
FM
2009
Springer
13 years 3 months ago
Partial Order Reductions Using Compositional Confluence Detection
Abstract. Explicit state methods have proven useful in verifying safety-critical systems containing concurrent processes that run asynchronously and communicate. Such methods consis...
Frédéric Lang, Radu Mateescu
PODC
2009
ACM
13 years 10 months ago
The weakest failure detector for solving k-set agreement
A failure detector is a distributed oracle that provides processes in a distributed system with hints about failures. The notion of a weakest failure detector captures the exact a...
Eli Gafni, Petr Kuznetsov
ICDCS
2012
IEEE
11 years 7 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...
TC
2002
13 years 5 months ago
Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems
We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lac...
Jean-François Hermant, Gérard Le Lan...