Sciweavers

22 search results - page 5 / 5
» Using replication and checkpointing for reliable task manage...
CCGRID
2009
IEEE
13 years 9 months ago
Failure-Aware Construction and Reconfiguration of Distributed Virtual Machines for High Availability Computing
In large-scale clusters and computational grids, component failures become norms instead of exceptions. Failure occurrence as well as its impact on system performance and operatio...
Song Fu
MIDDLEWARE
2007
Springer
13 years 11 months ago
AVMEM - Availability-Aware Overlays for Management Operations in Non-cooperative Distributed Systems
Monitoring and management operations that query nodes based on their availability can be extremely useful in a variety of large-scale distributed systems containing hundreds to thou...
Ramsés Morales, Brian Cho, Indranil Gupta