Abstract— We developed an automated environment to measure the memory access behavior of applications on high performance clusters. Code optimization for processor caches is cruc...
This paper presents a parallel version for the Propagation Algorithm which belongs to the region growing family of algorithms. The main goal of our implementation is to decrease de...
Since embedded systems require ever more compute power, SMT processors are viable candidates for future high performance embedded processors. However, SMTs exhibit unpredictable pe...
Francisco J. Cazorla, Peter M. W. Knijnenburg, Riz...
In this paper we use a mathematical approach to automatically generate high performance short vector code for the discrete Fourier transform (DFT). We represent the well-known Coo...
The standardisation procedure of the IEEE P1394.1 Draft Standard for High Performance Serial Bus Bridges is supported through the use of the state-of-the-art model checker Spin, w...