We present a system for describing and solving closed queuing network models of the memory access performance of NUMA architectures. The system consists of a model description lan...
Optimizing the performance of shared-memory NUMA programs remains something of a black art, requiring that application writers possess deep understanding of their programs’ beha...
Multiprocessor memory reference traces provide a wealth of information on the behavior of parallel programs. We have used this information to explore the relationship between kern...
William J. Bolosky, Michael L. Scott, Robert P. Fi...
UPC is an explicit parallel extension of ANSI C, which has been gaining rising attention from vendors and users. In this work, we consider the low-level monitoring and experimenta...