We propose, and justify, an economic theory to guide memory system design, operation, and analysis. Our theory treats memory random-access latency, and its cost per installed mega...
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with ...
Frederic T. Chong, Rajeev Barua, Fredrik Dahlgren,...
Widespread adaptation of shared memory programming for High Performance Computing has been inhibited by a lack of standardization and the resulting portability problems between pl...
Efficient intra-node shared memory communication is important for High Performance Computing (HPC), especially with the emergence of multi-core architectures. As clusters continue ...
In this work, to render at least 5123 voxel volumes in real-time, we have developed a sort-last parallel volume rendering method for distributed memory multiprocessors. Our sort-l...