The need to provide performance guarantee in high performance servers has long been neglected. Providing performance guarantee in current and future servers is difficult because ...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
Next-generation e-Science applications will require the ability to transfer information at high data rates between distributed computing centers and data repositories. A LambdaGri...
Despite the large research efforts in the SW–DSM community, this technology has not yet been adapted widely for significant codes beyond benchmark suites. One of the reasons co...
With processor speeds no longer doubling every 18-24 months owing to the exponential increase in power consumption and heat dissipation, modern HEC systems tend to rely lesser on ...
Pavan Balaji, Anthony Chan, William Gropp, Rajeev ...