– Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
Abstract. We discuss the High Performance Fortran data parallel programming language as an aid to software engineering and as a tool for exploiting High Performance Computing syste...
We address the problem of instruction selection for Multi-Output Instructions (MOIs), producing more than one result. Such inherently parallel hardware instructions are very commo...
Background: Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes...
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...