The Cell Broadband Engine (Cell BE) is a heterogeneous multi-core processor specifically designed to exploit thread-level parallelism. Its memory model comprehends a common shared ...
Abstract. Definitions for the uniform representation of d-dimensional matrices serially in Morton-order (or Z-order) support both their use with Cartesian indices, and their divide...
Memory system bottlenecks limit performance for many applications, and computations with strided access patterns are among the hardest hit. The streams used in such applications h...
Heterogeneous parallel systems incorporate diverse models of parallelism within a single machine or across machines and are better suited for diverse applications [25, 43, 30]. Thes...
Kathryn S. McKinley, Sharad Singhai, Glen E. Weave...
To prepare for future peta- or exa-scale computing, it is important to gain a good understanding of what impacts a hierarchical storage system would have on the performance of data...
Weikuan Yu, Sarp Oral, Shane Canon, Jeffrey S. Vet...