Intuitively, aggressive work sharing among concurrent queries in a database system should always improve performance by eliminating redundant computation or data accesses. We show...
Ryan Johnson, Nikos Hardavellas, Ippokratis Pandis...
Global address space languages like UPC exhibit high performance and portability on a broad class of shared and distributed memory parallel architectures. The most scalable applic...
With billion-transistor chips on the horizon, single-chip multiprocessors (CMPs) are likely to become commodity components. Speculative CMPs use hardware to enforce dependence, al...
Troy A. Johnson, Rudolf Eigenmann, T. N. Vijaykuma...
Dynamically allocating computing nodes to parallel applications is a promising technique for improving the utilization of cluster resources. Detailed simulations can help identify...
Basile Schaeli, Sebastian Gerlach, Roger D. Hersch
HPC programmers utilize tracefiles, which record program behavior in great detail, as the basis for many performance analysis activities. The lack of generally accessible tracefil...
Ken Ferschweiler, Scott Harrah, Dylan Keon, Mariac...