ECOWS
2010
Springer

Shepherd: node monitors for fault-tolerant distributed process execution in OSIRIS

OSIRIS is a middleware for the composition and orchestration of distributed web services. It follows a decentralized P2P approach to process execution, which already provides some degree of resilience to faults and high performance in large-scale computational clusters. In this paper, we present ongoing work aimed at improving the fault tolerance capabilities of OSIRIS. We introduce new architectural elements into OSIRIS for maintaining a virtual stable storage and monitoring the activities of service instances, together with algorithms that allow execution to survive failures that the system currently cannot cope with.

Categories and Subject Descriptors: C.2.4 [Distributed Systems]: Distributed applications, cloud computing, grid computing
General Terms: Algorithms, Reliability
Keywords: Decentralized process execution, fault-tolerance, OSIRIS, monitoring, DHT
Diego Milano, Nenad Stojnic
Added 10 Feb 2011
Updated 10 Feb 2011
Type Journal
Year 2010
Where ECOWS
Authors Diego Milano, Nenad Stojnic