The design of complex Systems-on-Chips implies to take into account communication and memory access constraints for the integration of dedicated hardware accelerator. In this paper...
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
This paper offers a high-level retrospective overview of the GOPI middleware platform which is the outcome of a three year project aimed at the development of generic, configurabl...
The widening gap between processor and memory performance is the main bottleneck for modern computer systems to achieve high processor utilization. In this paper, we propose a new...
Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edw...
This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodor...