Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
An optimizing compiler has a hard time to generate a code which will perform at top speed for an arbitrary data set size. In general, the low level optimization process must take i...
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
This paper describes a simple scheduling procedure for use in multicast data distribution within a logistical networking infrastructure. The goal of our scheduler is to generate a...
Jason Zurawski, D. Martin Swany, Micah Beck, Ying ...
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...