Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scal...
This paper studies the use of statistical induction techniques as a basis for automated performance diagnosis and performance management. The goal of the work is to develop and ev...
Ira Cohen, Jeffrey S. Chase, Julie Symons, Moisé...
The paper presents objectives and results of a series of case studies in computer support for diagnosis, failure mode and effects analysis, and the creation of repair manuals in t...
The ensemble Kalman filter for data assimilation involves the propagation of a collection of ensemble members. Under the assumption of time-sparse measurements, we avoid propagatin...
The efficient diagnosis of hardware and software faults in parallel and distributed systems remains a challenge in today’s most prolific decentralized environments. System-...