This paper describes a signal recognition system that is jointly optimized from mathematical representation, algorithm design and final implementation. The goal is to exploit sign...
Melina Demertzi, Pedro C. Diniz, Mary W. Hall, Ann...
To increase the scale and performance of scientific applications, scientists commonly distribute computation over multiple processors. Often without realizing it, file I/O is pa...
Seetharami R. Seelam, Andre Kerstens, Patricia J. ...
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs are becoming more reliant on fast data access, which can be improved...
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler t...
We study in this paper the problem of polyhedron scanning which appears for example when generating code for transformed loop nests in automatic parallelization. After a review of...