In this paper, we describe a set of compiler analyses and an implementation that automatically map a sequential and un-annotated C program into a pipelined implementation, targete...
Data decomposition is probably the most successful method for generating parallel programs. In this paper a general framework is described for the automatic generation of parallel...
Edwin M. R. M. Paalvast, Henk J. Sips, Arjan J. C....
We show how, given a program and its separation logic proof, one can parallelize and optimize this program and transform its proof simultaneously to obtain a proven parallelized an...
Extensible processors allow addition of application-specific custom instructions to the core instruction set architecture. These custom instructions are selected through an analys...
The recent shift in the industry towards chip multiprocessor (CMP) designs has brought the need for multi-threaded applications to mainstream computing. As observed in several lim...