— This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intranode communication is critical for developing an efficient communication s...
The designs of high-performance processor architectures are moving toward the integration of a large number of multiple processing cores on a single chip. The IBM Cyclops-64 (C64)...
With the latest high-end computing nodes combining shared-memory multiprocessing with hardware multithreading, new scheduling policies are necessary for workloads consisting of mu...
Robert L. McGregor, Christos D. Antonopoulos, Dimi...
The problem of writing high performance parallel applications becomes even more challenging when irregular, sparse or adaptive methods are employed. In this paper we introduce com...
We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program’s user community. Several example applications illustrat...
Ben Liblit, Alexander Aiken, Alice X. Zheng, Micha...