
HIPC
2009
Springer

Continuous performance monitoring for large-scale parallel applications

Traditional performance analysis techniques are applied after a parallel program has completed. In this paper, we describe an online method for continuously monitoring the performance of a parallel program, specifically the fraction of time spent in various activities as the program executes. We describe our implementation of both a visualization client and the parallel performance framework that gathers utilization data. The data gathering uses a scalable, asynchronous reduction with an appropriate lossless compressed data format. The overheads in the initial system are low, even when run on thousands of processors. The data gathering occurs in an out-of-band communication mechanism, interleaving itself transparently with the execution of the parallel application by leveraging a message-driven runtime system.

I. CONTINUOUS PERFORMANCE MONITORING
A. Importance of Continuous Performance Monitoring
Postmortem performance monitoring is the norm in parallel computing. In this com...
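
As a rough illustration of the kind of reduction the abstract describes, the following Python sketch sums per-processor time counts pairwise and derives the fraction of time spent in each activity. The activity names, the use of `zlib` and `struct` for a lossless compressed payload, and every function name here are assumptions made for illustration. The paper's actual data format and its asynchronous, message-driven implementation are not shown in this excerpt, and this sketch runs synchronously in a single process.

```python
import struct
import zlib

# Hypothetical activity categories; the excerpt does not name the real ones.
ACTIVITIES = ("compute", "communicate", "idle")


def pack(counts):
    """Serialize integer microsecond counts and compress them losslessly."""
    return zlib.compress(struct.pack(f"<{len(counts)}Q", *counts))


def unpack(blob):
    """Inverse of pack(): decompress and decode the integer counts."""
    raw = zlib.decompress(blob)
    return struct.unpack(f"<{len(raw) // 8}Q", raw)


def combine(a, b):
    """Reduction operator: element-wise sum of two compressed profiles."""
    return pack(tuple(x + y for x, y in zip(unpack(a), unpack(b))))


def tree_reduce(blobs):
    """Combine profiles pairwise, level by level, as a reduction tree would."""
    level = list(blobs)
    if not level:
        raise ValueError("need at least one profile")
    while len(level) > 1:
        nxt = [combine(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried to the next level
        level = nxt
    return level[0]


def utilization(per_processor_counts):
    """Return the fraction of total time spent in each activity."""
    totals = unpack(tree_reduce(pack(c) for c in per_processor_counts))
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("no time recorded")
    return {name: t / grand_total for name, t in zip(ACTIVITIES, totals)}
```

Integer counts are used so that the sum is exact and independent of the order in which partial results are combined, which is what lets a reduction tree be reshaped freely.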
Isaac Dooley, Chee Wai Lee, Laxmikant V. Kalé
Added 18 Feb 2011
Updated 18 Feb 2011
Type Journal
Year 2009
Where HIPC
Authors Isaac Dooley, Chee Wai Lee, Laxmikant V. Kalé