Sciweavers

46 search results - page 4 / 10
» An SNMP based failure detection service
GPC
2007
Springer
15 years 4 months ago
Fault Management in P2P-MPI
We present in this paper the recent developments done in P2P-MPI, a grid middleware, concerning the fault management, which covers fault-tolerance for applications and fault detect...
Stéphane Genaud, Choopan Rattanapoka
HPDC
1998
IEEE
15 years 2 months ago
A Fault Detection Service for Wide Area Distributed Computations
The potential for faults in distributed computing systems is a significant complicating factor for application developers. While a variety of techniques exist for detecting and co...
Paul Stelling, Ian T. Foster, Carl Kesselman, Crai...
APIN
2008
14 years 10 months ago
Achieving self-healing in service delivery software systems by means of case-based reasoning
Self-healing, i.e. the capability of a system to autonomously detect failures and recover from them, is a very attractive property that may enable large-scale software sys...
Stefania Montani, Cosimo Anglano
HPDC
2006
IEEE
15 years 4 months ago
Replicating Nondeterministic Services on Grid Environments
Replication is a technique commonly used to increase the availability of services in distributed systems, including grid and web services. While replication is relatively easy for...
Xianan Zhang, Flavio Junqueira, Matti A. Hiltunen,...
PODC
1998
ACM
15 years 2 months ago
Probabilistic Byzantine Quorum Systems
In this paper, we explore techniques to detect Byzantine server failures in asynchronous replicated data services. Our goal is to detect arbitrary failures of data servers in a s...
Dahlia Malkhi, Michael K. Reiter, Avishai Wool, Re...