UPC’s implicit communication and fine-grain programming style make application performance modeling a challenging task. The correspondence between remote references and communi...
This paper describes the radix-64 folded-Clos network of the Cray BlackWidow scalable vector multiprocessor. We describe the BlackWidow network which scales to 32K processors with...
Steve Scott, Dennis Abts, John Kim, William J. Dal...
In this paper, we present a Dynamic Load Balancing (DLB) policy for problems characterized by a highly irregular search tree, whereby no reliable workload prediction is available....
Growing levels of digitalization and broadband access drives extremely fast progress in multimedia and networking technologies and allows consumers to create requirements at an ac...
We present and evaluate the idea of adaptive processor cache management. Specifically, we describe a novel and general scheme by which we can combine any two cache management alg...
Ranjith Subramanian, Yannis Smaragdakis, Gabriel H...