Time skewing and loop tiling have long been known to be highly beneficial acceleration techniques for nested loops, especially on bandwidth-hungry multi-core processors...
We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and...
Michael Bauer, John Clark, Eric Schkufza, Alex Aik...
Performance modeling for scientific applications is important for assessing potential application performance and systems procurement in high-performance computing (HPC). Recent ...
We propose a cooperative methodology for multithreaded software, where threads use traditional synchronization idioms such as locks, but additionally document each point of potent...
Developing parallel software using current tools can be challenging. Even experts find it difficult to reason about the use of locks and often accidentally introduce race condit...
James Christopher Jenista, Yong Hun Eom, Brian Dem...