The locality of the data in parallel programs is known to have a strong impact on the performance of distributed-memory multiprocessor systems. The worse the locality in access pa...
Xinmin Tian, Shashank S. Nemawarkar, Guang R. Gao,...
This paper presents reduction recognition and parallel code generationstrategies for distributed-memorymultiprocessors. We describe techniques to recognize a broad range of implic...
On Chip Multiprocessors (CMP), it is common that multiple cores share certain levels of cache. The sharing increases the contention in cache and memory-to-chip bandwidth, further h...
Yunlian Jiang, Eddy Z. Zhang, Kai Tian, Xipeng She...
New heterogeneous multiprocessor platforms are emerging that are typically composed of loosely coupled components that exchange data using programmable interconnections. The compon...
User-controllable coherence revives the idea of cooperation between software and hardware in an attempt to bridge the gap between efficient small-scale shared memory machines and m...