An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performa...
Abstract. Most studies on adaptive partitioning policies for scheduling parallel jobs on distributed memory parallel computers ignore the constraints imposed by the memory requirem...
This paper presents a hardware-optimized variant of the well-known Gaussian elimination over GF(2) and its highly efficient implementation. The proposed hardware architecture, we...
Motivated by applications to sensor, peer-to-peer, and adhoc networks, we study the problem of computing functions of values at the nodes in a network in a totally distributed man...