This paper presents an architecture for programmable systolic arrays that provides simple and e cient systolic communication. The Brown Systolic Array is a linear implementation o...
We describe a software solution to the problem of automatic parallelization of linear algebra code on multi-processor and multi-core architectures. This solution relies on the defi...
In the case where the search space has a group structure, classical genetic operators (mutation and two-parent crossover) which respect the group action are completely characterize...
Jonathan E. Rowe, Michael D. Vose, Alden H. Wright
We describe parallel implementations of LU factorization with pivoting for multicore architectures. Implementations that differ in two different dimensions are discussed: (1) usin...
Ernie Chan, Robert A. van de Geijn, Andrew Chapman
The Virtual Interface (VI) Architecture provides protected userlevel communication with high delivered bandwidth and low permessage latency, particularly for small messages. The V...