The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
With the advent of large scale heterogeneous environments, there is a need for matching and scheduling algorithms which can allow multiple DAG-structured applications to share the...
In this paper, the effect of switch design on the application performance of cache-coherent non-uniform memory access (CC-NUMA) multiprocessors is studied in detail. Wormhole rout...
Laxmi N. Bhuyan, Hu-Jun Wang, Ravi R. Iyer, Akhile...
We analyze the performance of greedy routing for array networks by providing bounds on the average delay and the average number of packets in the system for the dynamic routing pr...
Emulators that translate algorithms from the shared-memory model to two different message-passing models are presented. Both are achieved by implementing a wait-free, atomic, singl...