Recently, a number of graph partitioning applications have emerged with additional requirements that the traditional graph partitioning model alone cannot effectively handle. One s...
Supercomputer performance is highly dependent on its interconnection subsystem design. In this paper we study how different architectural approaches for router design impact into s...
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with ...
Frederic T. Chong, Rajeev Barua, Fredrik Dahlgren,...
In this paper we introduce a page-based Lazy Release Consistency protocol called ADSM that constantly and efficiently adapts to the applications' sharing patterns. Adaptation...
As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that microprocessors will exploit having multiple parallel threads. To achieve t...