Sciweavers

295 search results - page 2 / 59
» Invariants Based Failure Diagnosis in Distributed Computing ...
ICAC
2005
IEEE
13 years 10 months ago
Distributed Troubleshooting Agents
Key issues to address in autonomic job recovery for cluster computing are recognizing job failure; understanding the failure sufficiently to know if and how to restart the job; an...
Charles Earl, Emilio Remolina, Jim Ong, John Brown
AGENTS
1997
Springer
13 years 9 months ago
Distributed Diagnosis by Vivid Agents
Many systems, such as large manufacturing systems, telecommunication networks, or home automation systems, require distributed monitoring and diagnosis. In this article, we introdu...
Michael Schroeder, Gerd Wagner
HPDC
2008
IEEE
13 years 11 months ago
Issues in applying data mining to grid job failure detection and diagnosis
As grid computation systems become larger and more complex, manually diagnosing failures in jobs becomes impractical. Recently, machine-learning techniques have been proposed to d...
Lakshmikant Shrinivas, Jeffrey F. Naughton
INFOCOM
2012
IEEE
11 years 7 months ago
Sherlock is around: Detecting network failures with local evidence fusion
Traditional approaches for wireless sensor network diagnosis are mainly sink-based. They actively collect global evidences from sensor nodes to the sink so as to conduct central...
Qiang Ma, Kebin Liu, Xin Miao, Yunhao Liu
ICAC
2008
IEEE
13 years 11 months ago
Guided Problem Diagnosis through Active Learning
There is widespread interest today in developing tools that can diagnose the cause of a system failure accurately and efficiently based on monitoring data collected from the syst...
Songyun Duan, Shivnath Babu