In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advan...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
In this paper we present the internal representation and optimizations used by the CASH compiler for improving the memory parallelism of pointer-based programs. CASH uses an SSA-b...
The MPI programming model hides network type and topology from developers, but also allows them to seamlessly distribute a computational job across multiple cores in both an intra ...
: This is a position paper introducing blob computing: A Blob is a generic primitive used to structure a uniform computing substrate into an easier-to-program parallel virtual mach...
Having a representative workload of the target domain of a microprocessor is extremely important throughout its design. The composition of a workload involves two issues: (i) whic...
Lieven Eeckhout, Hans Vandierendonck, Koenraad De ...