Distributed systems based on clusters of workstations are increasingly difficult to manage due to the growing number of processors involved and the complexity of associated appl...
This paper proposes a new method to detect abnormal process states. The method is based on monitoring cluster center points over time and is demonstrated in its application to data fro...
Debugging the performance of parallel and distributed systems remains a difficult task despite the widespread use of middleware packages for automatic distribution, communication...
Performance analysis for terascale computing requires a combination of new concepts including distribution, on-line processing and automation. As a foundation for tools r...
We present the Ka-admin project that addresses the problem of collecting, visualizing and feeding back any grid information, trace or snapshot, compliant with an XML-like model. Rea...