Moore’s Law and the drive towards performance efficiency have led to the on-chip integration of general-purpose cores with special-purpose accelerators. Pangaea is a heterogeneo...
Henry Wong, Anne Bracy, Ethan Schuchman, Tor M. Aa...
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...
Modern transactional response-time sensitive applications have run into practical limits on the size of garbage collected heaps. The heap can only grow until GC pauses exceed the ...
Performance simulation tools must be validated during the design process as functional models and early hardware are developed, so that designers can be sure of the performance of...
Supercomputer architects strive to maximize the performance of scientific applications. Unfortunately, the large, unwieldy nature of most scientific applications has lead to the...
Richard C. Murphy, Arun Rodrigues, Peter M. Kogge,...