High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibits a static and structured memory access pattern which results in a large amount of communic...
We derive analytically, the performance optimal throttling curve for a processor under thermal constraints for a given task sequence. We found that keeping the chip temperature co...
Scheduling DAGs with communication times is the theoretical basis for achieving efficient parallelism on distributed memory systems. We generalize Graham's task-level in a ma...
This paper presents a methodology to efficiently explore the design space of communication adapters. In most digital signal processing (DSP) applications, the overall performance ...
Cyrille Chavet, Philippe Coussy, Pascal Urard, Eri...
Several forms of processor memory organizations have been in use to optimally access off-chip memory systems mainly the Hard disk drives (HDD). Recent trends show that the solid s...