CX, a network-based computational exchange, is presented. The system’s design integrates variations of ideas from other researchers, such as work stealing, non-blocking tasks, e...
This work presents a general methodology for estimating the performance of an HPC workload when running on a future hardware architecture. Further, it demonstrates the methodology...
Ilya Sharapov, Robert Kroeger, Guy Delamarter, Raz...
A powerful and widely used method for analyzing the performance behavior of parallel programs is event tracing. When an application is traced, performance-relevant events, such as...
Felix Wolf, Felix Freitag, Bernd Mohr, Shirley Moo...
SCALASCA is a performance toolset that has been specifically designed to analyze parallel application execution behavior on large-scale systems. It offers an incremental performan...
Markus Geimer, Felix Wolf, Brian J. N. Wylie, Erik...
GPU-based heterogeneous clusters continue to draw attention from vendors and HPC users due to their high energy efficiency and much improved single-node computational performance...