This paper evaluates the use of per-node multi-threading to hide remote memory and synchronization latencies in a software DSM. As with hardware systems, multi-threading in softwa...
To prepare for future peta- or exa-scale computing, it is important to gain a good understanding of the impact a hierarchical storage system would have on the performance of data...
Weikuan Yu, Sarp Oral, Shane Canon, Jeffrey S. Vet...
This paper presents a Data-Distributed Execution approach that exploits iteration-level parallelism in loops operating over arrays. It performs data-dependency analysis, based o...
Heterogeneous computing combines general-purpose CPUs with accelerators to efficiently execute both sequential control-intensive and data-parallel phases of applications. Existin...
Isaac Gelado, Javier Cabezas, Nacho Navarro, John ...
In this paper, we investigate the data access patterns and file I/O behaviors of a production cosmology application that uses the adaptive mesh refinement (AMR) technique for it...
Jianwei Li, Wei-keng Liao, Alok N. Choudhary, Vale...