Efficient partitioning of parallel loops plays a critical role in high performance and efficient use of multiprocessor systems. Although a significant amount of work has been don...
Arun Kejariwal, Alexandru Nicolau, Utpal Banerjee,...
Reconfigurable computing (RC) applications employing both microprocessors and FPGAs have potential for large speedup when compared with traditional (software) parallel application...
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
The performance of many modern computer and communication systems is dictated by the latency of communication pipelines. At the same time, power/energy consumption is often another...
Network-on-Chip (NoC) architectures have a wide variety of parameters that can be adapted to the designer’s requirements. Fast exploration of this parameter space is only possib...