One purpose of the end-user tools described in this paper is to give users a graphical representation of performance information that has been gathered by instrumenting an applica...
Kevin S. London, Jack Dongarra, Shirley Moore, Phi...
Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that, the threads of an application can in...
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Complete system simulation to understand the influence of architecture and operating systems on application execution has been identified to be crucial for systems design. While t...
Tao Li, Lizy Kurian John, Narayanan Vijaykrishnan,...
We present an AC1 (logDCFL) algorithm for checking LTL formulas over finite paths, thus establishing that the problem can be efficiently parallelized. Our construction provides a f...