Process skew is an important factor in the performance of parallel applications, especially in large-scale clusters. Reduction is a common collective operation which, by its natur...
Adam Wagner, Darius Buntinas, Dhabaleswar K. Panda...
To construct a large commodity clustec a hierarchical network is generally adopted for connecting the host muchines, where a Gigabit backbone switch connects a few commodity switc...
In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through an algorithm that can be implemented in hard...
This paper describes a technique called producer-push that enhances the performance of a page-based software distributed shared memory system. Shared data, in software DSM systems...
In this paper we propose the OPTNET, a novel optical network and associated coherence protocol for scalable multiprocessors. The network divides its channels into broadcast and po...