Sciweavers

» Invariants Based Failure Diagnosis in Distributed Computing ...
ASPLOS
2010
ACM
14 years 1 day ago
SherLog: error diagnosis by connecting clues from run-time logs
Computer systems often fail due to many factors such as software bugs or administrator errors. Diagnosing such production run failures is an important but challenging task since i...
Ding Yuan, Haohui Mai, Weiwei Xiong, Lin Tan, Yuan...
ATC
2006
Springer
13 years 9 months ago
Multi-level Model-Based Self-diagnosis of Distributed Object-Oriented Systems
Self-healing relies on correct diagnosis of system malfunctioning. This paper presents a use-case based approach to self-diagnosis. Both a static and a dynamic model of a managed-s...
A. Reza Haydarlou, Benno J. Overeinder, Michel A. ...
ATAL
2007
Springer
13 years 11 months ago
Diagnosis of plan step errors and plan structure violations
Failures in plan execution can be attributed to errors in the execution of plan steps or violations of the plan structure. The structure of a plan prescribes which actions have to...
Cees Witteveen, Nico Roos, Adriaan ter Mors, Xiaoy...
IPPS
2009
IEEE
13 years 12 months ago
Robust sequential resource allocation in heterogeneous distributed systems with random compute node failures
The problem of finding efficient workload distribution techniques is becoming increasingly important today for heterogeneous distributed systems where the availability of comp...
Vladimir Shestak, Edwin K. P. Chong, Anthony A. Ma...
TPDS
2010
13 years 3 months ago
Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation
In distributed computing systems (DCSs) where server nodes can fail permanently with nonzero probability, the system performance can be assessed by means of the service reliabil...
Jorge E. Pezoa, Sagar Dhakal, Majeed M. Hayat