In this paper we consider several hardware implementations of the general-purpose atomic primitives fetch and Φ, compare and swap, load linked, and store conditionalon large-scal...
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
We evaluate the e ect of processor speed, network bandwidth, and software overhead on the performance of release-consistent software distributed shared memory. We examine ve di er...
Sandhya Dwarkadas, Peter J. Keleher, Alan L. Cox, ...
Most recovery schemes that have been proposed for Distributed Shared Memory (DSM) systems require unnecessarily high checkpointing frequency and checkpoint traffic, which are sens...
Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher perform...