Sciweavers

WOSS
2004
ACM

Combining statistical monitoring and predictable recovery for self-management

Complex distributed Internet services form the basis not only of e-commerce but increasingly of mission-critical network-based applications. What is new is that the workload and internal architecture of three-tier enterprise applications present the opportunity for a new approach to keeping them running in the face of many common recoverable failures. The core of the approach is anomaly detection and localization based on statistical machine learning techniques. Unlike previous approaches, we propose anomaly detection and pattern mining not only for operational statistics such as mean response time, but also for structural behaviors of the system: what parts of the system, in what combinations, are being exercised in response to different kinds of external stimuli. In addition, rather than building baseline models a priori, we extract them by observing the behavior of the system over a short period of time during normal operation. We explain the necessary underlying assumptions and ...
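To make the idea of a baseline learned from normal operation concrete, the following is a minimal sketch only, not the authors' implementation. It assumes each request is recorded as an ordered list of component names (a hypothetical trace format), learns the relative frequency of each component-to-component call from normal traffic, and then scores a later window of traffic with a chi-square-style deviation. The function names, the `floor` parameter, and the choice of statistic are all illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter


def count_edges(traces):
    """Count component-to-component calls across a list of request traces."""
    counts = Counter()
    for trace in traces:
        # Each adjacent pair in a trace is one observed interaction.
        counts.update(zip(trace, trace[1:]))
    return counts


def learn_baseline(traces):
    """Estimate the relative frequency of each interaction from normal traffic."""
    counts = count_edges(traces)
    total = sum(counts.values())
    return {edge: n / total for edge, n in counts.items()}


def edge_deviations(baseline, traces, floor=1e-6):
    """Per-interaction chi-square-style deviation from the learned baseline.

    `floor` (an assumed smoothing value) stands in for the expected
    frequency of interactions never seen during normal operation.
    """
    observed = count_edges(traces)
    total = sum(observed.values())
    deviations = {}
    for edge in set(baseline) | set(observed):
        expected = max(baseline.get(edge, 0.0), floor) * total
        if expected > 0:
            deviations[edge] = (observed.get(edge, 0) - expected) ** 2 / expected
    return deviations


# Hypothetical traces: component names are made up for illustration.
normal = [["web", "app", "db"]] * 50 + [["web", "app", "cache"]] * 50
suspect = [["web", "app", "db"]] * 10 + [["web", "db"]] * 10

baseline = learn_baseline(normal)
deviations = edge_deviations(baseline, suspect)
anomaly_score = sum(deviations.values())
most_anomalous_edge = max(deviations, key=deviations.get)
```

In this sketch the total score plays the role of detection and the highest-scoring interaction plays the role of localization: the previously unseen `("web", "db")` call dominates the deviation because it never appeared in the baseline.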
Armando Fox, Emre Kiciman, David A. Patterson
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where WOSS
Authors Armando Fox, Emre Kiciman, David A. Patterson