The efficient implementation of multimedia algorithms, for the ever increasing complexity of the specifications and the emergence of the new generation of processing platforms c...
Christophe Lucarz, Marco Mattavelli, Julien Dubois
The most intuitive memory consistency model for shared-memory multi-threaded programming is sequential consistency (SC). However, current concurrent programming languages support ...
Daniel Marino, Abhayendra Singh, Todd D. Millstein...
Abstract. The DEFACTO project - a Design Environment For Adaptive Computing TechnOlogy - is a system that maps computations, expressed in high-level languages such as C, directly o...
Pedro C. Diniz, Mary W. Hall, Joonseok Park, Byoun...
Software-controlled data prefetching is a promising technique for improving the performance of the memory subsystem to match today's high-performance processors. While prefet...
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...