Heterogeneous supercomputers with combined general purpose and accelerated CPUs promise to be the future major architecture due to their wideranging generality and superior perfor...
Loop nest optimization is a combinatorial problem. Due to the growing complexity of modern architectures, it involves two increasingly difficult tasks: (1) analyzing the profita...
—The most pervasive compute operation carried out in almost all bioinformatics applications is pairwise sequence homology detection (or sequence alignment). Due to exponentially ...
In SMT processors, the complex interplay between private and shared datapath resources needs to be considered in order to realize the full performance potential. In this paper, we...
The Fast Fourier Transform (FFT) is of primary importance and a fundamental kernel in many computationally intensive scientific applications. In this paper we investigate its perf...