The use of multiprocessor tasks (M-tasks) has been shown to be successful for mixed task and data parallel implementations of algorithms from scientific computing. The approach o...
Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for di erent imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...
Partial Redundancy Elimination PRE is a general scheme for suppressing partial redundancies which encompasses traditional optimizations like loop invariant code motion and redun...
As the gap between processor and memory continues to grow, memory performance becomes a key performance bottleneck for many applications. Compilers therefore increasingly seek to m...
Sandya S. Mannarswamy, Ramaswamy Govindarajan, Ris...
Abstract. Clocks are a mechanism for providing synchronization barriers in concurrent programming languages. They are usually implemented using primitive communication mechanisms a...