A large and increasing gap exists between processor and memory speeds in scalable cache-coherent multiprocessors. To cope with this situation, programmers and compiler writers mus...
In recent years, microprocessor manufacturers have shifted their focus from single-core to multi-core processors. To avoid burdening programmers with the responsibility of paralle...
Neil Vachharajani, Ram Rangan, Easwaran Raman, Mat...
We describe the Paraflow system for connecting heterogeneous computing services together into a flexible and efficient data-mining metacomputer. There are three levels of parallel...
Communication overheads are one of the fundamental challenges in a multiprocessor system. As the number of processors on a chip increases, communication overheads and the distribu...
Katherine E. Coons, Behnam Robatmili, Matthew E. T...
This paper presents an automated performance tuning solution, which partitions a program into a number of tuning sections and finds the best combination of compiler options for e...