This work aims to pave the way for high availability in high-performance computing (HPC) by focusing on efficient redundancy strategies for head and service nodes. These...
Christian Engelmann, Stephen L. Scott, Chokchai Le...
Purdue University operates one of the largest cycle recovery systems in existence in academia based on the Condor workload management system. This system represents a valuable and...
System-level virtualization is today enjoying a rebirth, after first gaining popularity in the 1970s as a technique to effectively share what were then considered large computin...
networking with a layer 2 abstraction provides a powerful model for virtualized wide-area distributed computing resources, including for high performance computing (HPC) on collec...
Lei Xia, Zheng Cui, John R. Lange, Yuan Tang, Pete...
New static source routing algorithms for High Performance Computing (HPC) are presented in this work. The target parallel architectures are based on the commonly used fat-tree netw...