
FGCS
2010

Self-healing network for scalable fault-tolerant runtime environments

Scalable and fault-tolerant runtime environments are needed to support and adapt to the underlying libraries and hardware, which require a high degree of scalability in dynamic large-scale environments. This paper presents a self-healing network (SHN) for supporting scalable and fault-tolerant runtime environments. The SHN is designed to support transmission of messages across multiple nodes while also protecting against recursive node and process failures, and it recovers automatically after a failure occurs. The SHN is implemented on top of a scalable fault-tolerant protocol (SFTP). The experimental results show that both the latest multicast and broadcast routing algorithms used in the SHN are faster than the original SFTP routing algorithms.
Added 25 Jan 2011
Updated 25 Jan 2011
Type Journal
Year 2010
Where FGCS
Authors Thara Angskun, Graham E. Fagg, George Bosilca, Jelena Pjesivac-Grbovic, Jack Dongarra