Abstract. We show a two-phase approach for minimizing various communication-cost metrics in fine-grain partitioning of sparse matrices for parallel processing. In the first phase...
Long-Running transactions consist of tasks which may be executed sequentially and in parallel, may contain sub-tasks, and may require to be completed before a deadline. These trans...
Ruggero Lanotte, Andrea Maggiolo-Schettini, Paolo ...
Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kal&ea...
The overhead of copying data through the central processor by a message passing protocol limits data transfer bandwidth. If the network interface directly transfers the user'...
Hiroshi Tezuka, Francis O'Carroll, Atsushi Hori, Y...
In a hardware transactional memory system with lazy versioning and lazy conflict detection, the process of transaction commit can emerge as a bottleneck. This is especially true ...
Seth H. Pugsley, Manu Awasthi, Niti Madan, Naveen ...