One of the key decisions made by both MapReduce and HPC cluster management frameworks is the placement of jobs within a cluster. To make this decision, they consider factors like ...
National labs, academic institutions, and industry have a strong need for scientists and staff who understand high-performance computing (HPC) and the complex interconnections ac...
We present an architecture for high-performance computers that integrates in situ analysis of hardware and system monitoring data with application-specific data to reduce applica...
Today’s rapid development of supercomputers has made I/O performance a major bottleneck for many scientific applications. Trace analysis tools have thus...
Xiaoqing Luo, Frank Mueller, Philip H. Carns, John...
Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of ...
Michael A. Sevilla, Noah Watkins, Carlos Maltzahn,...