The demand for more computational power in science and engineering has spurred the design and deployment of ever-growing cluster systems. Even though the individual components use...
Distributed storage systems often use data replication to mask failures and guarantee high data availability. Node failures can be transient or permanent. While the system must ge...
Jing Tian, Zhi Yang, Wei Chen, Ben Y. Zhao, Yafei ...
Process migration has been used to perform specialized tasks, such as load sharing and checkpoint/restarting long running applications. Implementation typically consists of modifi...
Grid applications comprise several components and web-services that make them highly prone to the occurrence of transient software failures and aging problems. This type of failur...
Modern internet and telephone switches consist of numerous VLSI-circuits operating at high frequencies to handle high bandwidths. It is beyond question that such systems must cont...