In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we...
Hikmet Dursun, Kevin J. Barker, Darren J. Kerbyson...
Hybrid chip multithreaded SMPs present new challenges as well as new opportunities to maximize performance. Our intention is to discover the optimal operating configuration of suc...
The excessive complexity of both machine architectures and applications have made it difficult for compilers to statically model and predict application behavior. This observatio...
Qing Yi, Keith Seymour, Haihang You, Richard W. Vu...
Abstract. Prefetching transfers a data item in advance from its storage location to its usage location so that communication is hidden and does not delay computation. We present a ...
Michael Klemm, Jean Christophe Beyler, Ronny T. La...
Recent research on processor microarchitecture suggests using instruction criticality as a metric to guide hardware control policies. Fields et al. [3, 4] have proposed a directed...