—Starting from sequential programs, we present an approach combining data reuse, multi-level MapReduce, and pipelining to automatically find the most power-efficient designs th...
Many image and signal processing kernels can be optimized for performance consuming a reasonable area by doing loops parallelization with extensive use of pipelining. This paper p...
Zubair Nawaz, Thomas Marconi, Koen Bertels, Todor ...
Chip multiprocessors designed for streaming applications such as Cell BE offer impressive peak performance but suffer from limited bandwidth to offchip main memory. As the number o...
We address the problem of minimizing power consumption in behavioral synthesis of data-dominated circuits. The complex nature of power as a cost function implies that the effects ...
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...