Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...
Mapping computations written in high-level programming languages to FPGA-based computing engines requires programmers to generate the datapath responsible for the core of the comp...
The Merrimac supercomputer uses stream processors and a highradix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities o...
Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. D...
We comparethe performance of software-supported shared memory on a general-purpose network to hardware-supported shared memory on a dedicated interconnect. Up to eight processors,...
Alan L. Cox, Sandhya Dwarkadas, Peter J. Keleher, ...
Abstract—Convergence of communication, consumer applications and computing within mobile systems pushes memory requirements both in terms of size, bandwidth and power consumption...
Marco Facchini, Trevor Carlson, Anselme Vignon, Ma...