Sciweavers

SAC
2006
ACM

Combining supervised and unsupervised monitoring for fault detection in distributed computing systems

Fast and accurate fault detection is becoming an essential component of management software for mission-critical systems. A good fault detector makes it possible to initiate repair actions quickly, increasing the availability of the system. The contribution of this paper is twofold. First, a new concept of supervised and unsupervised monitoring is proposed for system fault detection. We use a statistical method, canonical correlation analysis (CCA), to model the contextual dependencies between system inputs u and internal behavior x. By means of CCA, the space x is transformed into two subsets of variables, which are monitored in a supervised and an unsupervised manner, respectively. By doing so, our approach can reduce the false alarms resulting from unusual workload changes, and hence achieve a high fault detection rate. Second, in order to test the performance of our approach, we simulate a variety of system faults in a real e-commerce application based on the multi-tiered J2EE architecture....
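The CCA-based split described in the abstract can be illustrated with a minimal sketch. This is one plausible reading, not the authors' implementation: the abstract does not state the monitoring statistics, the thresholds, or how the two subsets are chosen. The function names (`fit_cca`, `split_scores`, `naive_score`), the `corr_threshold` cutoff, the 0.99 quantile limits, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(u, x, reg=1e-8):
    """Fit CCA between inputs u (n x p) and internal metrics x (n x q)."""
    u_mean, x_mean = u.mean(axis=0), x.mean(axis=0)
    uc, xc = u - u_mean, x - x_mean
    n = u.shape[0]
    suu = uc.T @ uc / (n - 1) + reg * np.eye(u.shape[1])
    sxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    sux = uc.T @ xc / (n - 1)
    lu = np.linalg.cholesky(suu)  # suu = lu @ lu.T
    lx = np.linalg.cholesky(sxx)  # sxx = lx @ lx.T
    # Whitened cross-covariance: lu^{-1} @ sux @ lx^{-T}
    kmat = np.linalg.solve(lu, np.linalg.solve(lx, sux.T).T)
    left, corr, right_t = np.linalg.svd(kmat)
    a = np.linalg.solve(lu.T, left)       # weights for u (p x p)
    b = np.linalg.solve(lx.T, right_t.T)  # weights for x (q x q), spans all of x
    return {"u_mean": u_mean, "x_mean": x_mean, "a": a, "b": b,
            "corr": corr, "sxx": sxx}

def split_scores(model, u, x, corr_threshold=0.5):
    """Score the input-correlated part of x against u, and the rest on its own."""
    uc = np.atleast_2d(u) - model["u_mean"]
    xc = np.atleast_2d(x) - model["x_mean"]
    s = model["corr"]
    k = int(np.sum(s > corr_threshold))  # hypothetical cutoff for "correlated"
    w = uc @ model["a"][:, :k]           # canonical variates of the inputs
    z = xc @ model["b"]                  # whitened coordinates of x
    # Supervised part: residual of z given u, scaled by its variance 1 - s^2.
    resid = z[:, :k] - w * s[:k]
    sup = np.sum(resid ** 2 / (1.0 - s[:k] ** 2 + 1e-12), axis=1)
    # Unsupervised part: remaining unit-variance coordinates, squared norm.
    unsup = np.sum(z[:, k:] ** 2, axis=1)
    return sup, unsup

def naive_score(model, x):
    """Baseline that ignores u: squared Mahalanobis distance of x."""
    xc = np.atleast_2d(x) - model["x_mean"]
    return np.sum(xc * np.linalg.solve(model["sxx"], xc.T).T, axis=1)

# Synthetic training data (assumption): x follows u linearly, plus noise.
rng = np.random.default_rng(0)
n = 2000
mix = np.array([[1.0, 0.5, 0.0, 0.0],
                [0.0, 1.0, 0.5, 0.0]])
u_train = rng.normal(size=(n, 2))
x_train = u_train @ mix + 0.3 * rng.normal(size=(n, 4))

model = fit_cca(u_train, x_train)
sup_train, unsup_train = split_scores(model, u_train, x_train)
sup_limit = np.quantile(sup_train, 0.99)
unsup_limit = np.quantile(unsup_train, 0.99)
naive_limit = np.quantile(naive_score(model, x_train), 0.99)

def is_alarm(u, x):
    sup, unsup = split_scores(model, u, x)
    return bool(sup[0] > sup_limit or unsup[0] > unsup_limit)

def naive_alarm(x):
    return bool(naive_score(model, x)[0] > naive_limit)

# Unusual but healthy workload: x moves exactly as u predicts.
u_surge = np.array([6.0, 6.0])
x_surge = u_surge @ mix
# Fault: x shifts although the workload is ordinary.
u_fault = np.array([0.5, -0.5])
x_fault = u_fault @ mix + np.array([3.0, 3.0, 0.0, 0.0])

print("surge:", is_alarm(u_surge, x_surge), "naive:", naive_alarm(x_surge))
print("fault:", is_alarm(u_fault, x_fault))
```

In this toy setup the workload surge is far outside the training range of x, so a detector that looks at x alone flags it, while the split detector compares x with what u predicts and stays quiet. The injected shift is not explained by u and is flagged.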
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2006
Where SAC
Authors Haifeng Chen, Guofei Jiang, Cristian Ungureanu, Kenji Yoshihira