Using Linux for high-performance applications on the compute nodes of IBM Blue Gene/P is challenging because of TLB misses and difficulties with programming the network DMA engine...
Kazutomo Yoshii, Kamil Iskra, Harish Naik, Pete Be...
The increasing complexity of hardware features in recent processors makes high-performance code generation very challenging. In particular, several optimization targets have to b...
We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by ...
Dorian C. Arnold, Dong H. Ahn, Bronis R. de Supins...
This paper presents a case study in the generic design of Grid component models. It defines a framework allowing two component systems, one running in a CCA environment, and anoth...
Graphics processing units (GPUs) now provide increasingly high performance with programmable internal processors, namely vertex processors (VPs) and fragment process...