Efficient performance tuning of parallel programs is often hard. In this paper we describe an approach that uses a uni-processor execution of a multithreaded program as reference ...
The advent of multicores presents a promising opportunity for exploiting fine grained parallelism present in programs. Programs parallelized in the above fashion, typically involv...
Predicting the running time of a parallel program is useful for determining the optimal values for the parameters of the implementation and the optimal mapping of data on processo...
We present a new approach that enables compiler optimization of procedure calls and loop nests containing procedure calls. We introduce two interprocedural transformationsthat mov...
We propose the first polynomial-time code selection algorithm for minimising the worst-case execution time of a nonnested loop executed on a fully pipelined processor that uses sc...