Sciweavers

ICS
1993
Tsinghua U.

Dynamic Control of Performance Monitoring on Large Scale Parallel Systems

Performance monitoring of large-scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. We present a new approach, called the W3 Search Model, that addresses both of these problems by combining dynamic, on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. We present a case study describing how a prototype implementation of our technique was able to identify the bottlenecks in three real programs. In addition, we were able to reduce the amount of performance data collected by a factor ranging from 13 to 700 compared to traditional sampling and trace-based instrumentation techniques.
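The abstract does not spell out how the W3 Search Model chooses what to measure, so the following is only a minimal sketch of the general idea of on-the-fly selection, not the authors' algorithm. It assumes a hypothetical tree of program components with made-up cost fractions and a hypothetical threshold. A component is measured first, and its children are instrumented only if the component itself looks like a bottleneck.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical program component: whole program, module, or procedure."""
    name: str
    cost: float  # fraction of run time spent here (illustrative data only)
    children: List["Node"] = field(default_factory=list)


def refine(root: Node, threshold: float) -> Tuple[List[str], int]:
    """Top-down search that descends only into components above the threshold.

    Returns the leaf components flagged as bottlenecks and the number of
    components that had to be measured along the way.
    """
    bottlenecks: List[str] = []
    probes = 0
    frontier = [root]
    while frontier:
        node = frontier.pop()
        probes += 1  # instrumentation is enabled for this component only
        if node.cost < threshold:
            continue  # not a bottleneck, so its children are never measured
        if node.children:
            frontier.extend(node.children)
        else:
            bottlenecks.append(node.name)
    return bottlenecks, probes


def count_nodes(node: Node) -> int:
    """Number of components a measure-everything approach would instrument."""
    return 1 + sum(count_nodes(child) for child in node.children)


# Assumed example program; names and numbers are invented for illustration.
program = Node("program", 1.0, [
    Node("io", 0.05, [Node("read", 0.03), Node("write", 0.02)]),
    Node("solver", 0.80, [Node("exchange", 0.60), Node("compute", 0.20)]),
    Node("init", 0.15, [Node("alloc", 0.10), Node("parse", 0.05)]),
])

found, probes = refine(program, threshold=0.25)
print(found, probes, count_nodes(program))
```

In this toy tree the search measures fewer components than exhaustive instrumentation would. The savings here say nothing about the factors of 13 to 700 reported in the abstract, which come from the authors' own case study.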
Jeffrey K. Hollingsworth, Barton P. Miller
Added: 09 Aug 2010
Updated: 09 Aug 2010
Type: Conference
Year: 1993
Where: ICS
Authors: Jeffrey K. Hollingsworth, Barton P. Miller