Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
How can we exploit a microprocessor as efficiently as possible? The “classic” approach is static optimization at compile-time, optimizing a program for all possible u...
Kevin Streit, Clemens Hammacher, Andreas Zeller, S...
This paper explores the relation between the structured parallelism exposed by the Decomposable BSP (DBSP) model through submachine locality and locality of reference in multi-lev...
Andrea Pietracaprina, Geppino Pucci, Francesco Sil...
A metascalable (or “design once, scale on new architectures”) parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials...
Ken-ichi Nomura, Richard Seymour, Weiqiang Wang, H...