We introduce a software/hardware scheme called the Field Array Compression Technique (FACT), which reduces cache misses due to recursive data structures. Using a data layout transfo...
We investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta (DIIRK) method, an iteration method based on a predictor-corrector scheme. This method ...
The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...
We focus on the parallel access of randomly aligned rectangular blocks of visual data. As an alternative to traditional linearly addressable memories, we suggest a memory organizat...
Georgi Kuzmanov, Georgi Gaydadjiev, Stamatis Vassi...
Array remappings are useful to many applications on distributed memory parallel machines. They are available in High Performance Fortran, a Fortran-based data-parallel language. T...