A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper pr...
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert...
We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program’s user community. Several example applications illustrat...
Ben Liblit, Alexander Aiken, Alice X. Zheng, Micha...
In an intelligent memory architecture, the main memory of a computer is enhanced with many simple processors. The result is a highly parallel, heterogeneous machine that is able t...
Basilio B. Fraguela, Jose Renau, Paul Feautrier, D...
When using a shared-memory multiprocessor, the programmer must select the portable programming model that will deliver the best performance. Even if he restricts his c...
Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencie...