: In scalable parallel machines, processors can make local memory accesses much faster than they can make remote memory accesses. In addition, when a number of remote accesses must...
In this paper we propose a design technique to pipeline cache memories for high bandwidth applications. With the scaling of technology cache access latencies are multiple clock cy...
The Cray MTA-2 system provides exceptional performance on a variety of sparse graph algorithms. Unfortunately, it was an extremely expensive platform. Cray is preparing an Eldorad...
Keith D. Underwood, Megan Vance, Jonathan W. Berry...
We focus on the parallel access of randomly aligned rectangular blocks of visual data. As an alternative of traditional linearly addressable memories, we suggest a memory organizat...
Georgi Kuzmanov, Georgi Gaydadjiev, Stamatis Vassi...
VIRAM (Vector Intelligent Random Access Memory) is a vector architecture processor with embedded memory, designed for portable multimedia processing devices. Its vector processing...
Thinh P. Q. Nguyen, Avideh Zakhor, Katherine A. Ye...