Sciweavers

DSOM
2004
Springer

Failure Recovery in Distributed Environments with Advance Reservation Management Systems

Advance reservation is a mature concept for allocating various resources, particularly in grid environments. Common grid toolkits such as Globus support advance reservations and assign jobs to resources at admission time. While allocation mechanisms for advance reservations are available in current grid management systems, failure handling under advance reservation demands strategies that go beyond recovering the jobs or applications that are active when the resource failure occurs. Applications that have already been admitted but have not yet started are also affected by the failure and hence need to be dealt with appropriately. In this paper, we discuss the properties of advance reservations with respect to failure recovery and outline a number of strategies applicable in such cases to reduce the impact of resource failures and outages. It can be shown that it pays to also remap affected but not yet started jobs to al...
Lars-Olof Burchard, Barry Linnert
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where DSOM
Authors Lars-Olof Burchard, Barry Linnert