Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data ...
This paper presents CMP Cooperative Caching, a unified framework to manage a CMP’s aggregate on-chip cache resources. Cooperative caching combines the strengths of private and ...
Modern chip-level multiprocessors (CMPs) contain multiple processor cores sharing a common last-level cache, memory interconnects, and other hardware resources. Workloads running ...
Richard West, Puneet Zaroo, Carl A. Waldspurger, X...
A programmable vector processor and its implementation with a field-programmable gate array (FPGA) are presented. This processor is composed of a vector core and a tightly couple...