Modulo scheduling is an e cient technique for exploiting instruction level parallelism in a variety of loops, resulting in high performance code but increased register requirement...
Alexandre E. Eichenberger, Edward S. Davidson, San...
In 2002, Japan announced the Earth Simulator—a supercomputer based on low-volume vector processors and a custom network—and reported that computational scientists had used it ...
Abstract. The data redistribution problems on multi-computers had been extensively studied. Irregular data redistribution has been paid attention recently since it can distribute d...
This paper presents language features for High Performance Fortran HPF to specify non-local access patterns of distributed arrays, called halos, and to control the communication as...
The NPAC kernel runtime, developed in the PCRC Parallel Compiler Runtime Consortium project, is a runtime library with special support for the High Performance Fortran data model....
Bryan Carpenter, Geoffrey Fox, Donald Leskiw, Xiao...