This paper describes the principles of an original adaptive interconnect for a computational cluster. A torus topology (2D or 3D) is used as a basis, but nodes are allowed to effecti...
The scale-out approach, in contrast to the scale-up approach (which seeks higher performance by using more powerful shared-memory servers), refers to the deployment of applications on a...
While improving raw performance is of primary interest to most users of high-performance computers, energy consumption is also a critical concern. Some microprocessors allow volta...
This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
In recent years, several approaches have been proposed to use profile information in compiler optimization. This profile information can be used at the source level to guide loo...
Masayo Haneda, Peter M. W. Knijnenburg, Harry A. G...