The growing computational and storage needs of several scientific applications mandate the deployment of extreme-scale parallel machines, such as IBM’s BlueGene/L, which can acc...
This paper is structured as follows. Section 2 gives an architectural description of BlueGene/L. Section 3 analyzes the issue of “computational noise” – the effect that the o...
Kei Davis, Adolfy Hoisie, Greg Johnson, Darren J. ...
Frequent failures are becoming a serious concern to the community of high-end computing, especially when the applications and the underlying systems rapidly grow in size and compl...
The demand for more computational power in science and engineering has spurred the design and deployment of ever-growing cluster systems. Even though the individual components use...
Frequent failure occurrences are becoming a serious concern to the community of high-end computing, especially when the applications and the underlying systems rapidly grow in ...