Abstract. We present PerfMiner, a system for the transparent collection, storage and presentation of thread-level hardware performance data across an entire cluster. Every sub-proc...
Philip Mucci, Daniel Ahlin, Johan Danielsson, Per ...
Clusters of workstations are becoming popular platforms for parallel computing, but performance on these systems is more complex and harder to predict than on traditional parallel...
Geetanjali Sampemane, Scott Pakin, Andrew A. Chien
In this paper, we present a structure for monitoring a large set of computational clusters. We illustrate methods for scaling a monitor network composed of many clusters while ke...
Federico D. Sacerdoti, Mason J. Katz, Matthew L. M...
Supermon is a flexible set of tools for high-speed, scalable cluster monitoring. Node behavior can be monitored much faster than with other commonly used methods (e.g., rstatd). ...
Debugging the performance of parallel and distributed systems remains a difficult task despite the widespread use of middleware packages for automatic distribution, communication...