Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performan...
Load-reuse analysis finds instructions that repeatedly access the same memory location. This location can be promoted to a register, eliminating redundant loads by reusing the re...
We present a scalable temporal order analysis technique that supports debugging of large scale applications by classifying MPI tasks based on their logical program execution order...
Dong H. Ahn, Bronis R. de Supinski, Ignacio Laguna...
Inconsistency between metadata and code customizations is a major concern in modern, configurable enterprise systems. The increasing reliance on metadata, in the form of XML files,...
Memory corruption errors lead to non-deterministic, elusive crashes. This paper describes ARCHER (ARray CHeckER) a static, effective memory access checker. ARCHER uses path-sensit...