The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
We demonstrate that data reordering can substantially improve the performance of fine-grained irregular shared-memory benchmarks, on both hardware and software shared-memory syste...
Because irregular applications have unpredictable memory access patterns, their performance is dominated by memory behavior. The Impulse configurable memory controller will enable s...
John B. Carter, Wilson C. Hsieh, Mark R. Swanson, ...
This paper investigates two types of overhead due to duplicated local computations, which are frequently encountered in the parallel software of overlapping domain decomp...
We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and...
Michael Bauer, John Clark, Eric Schkufza, Alex Aik...