Sciweavers

CLUSTER
2007
IEEE

Evaluation of fault-tolerant policies using simulation

Various fault-tolerance (FT) mechanisms are used today to reduce the impact of failures on application execution. In the case of system failure, the standard FT mechanisms are checkpoint/restart (for reactive FT) and migration (for pro-active FT). However, each of these mechanisms creates an overhead on application execution, and this overhead becomes critical on large-scale systems, where previous studies have shown that applications may spend more time checkpointing state than performing useful work. To decrease this overhead, researchers try both to optimize existing FT mechanisms and to implement new FT policies, for instance by combining reactive and pro-active approaches to decrease the number of checkpoints that must be performed during the application’s execution. However, no solutions currently exist that enable the evaluation of these FT approaches through simulation; instead, experiments must be done on real platforms. This increase...
Added 02 Jun 2010
Updated 02 Jun 2010
Type Conference
Year 2007
Where CLUSTER
Authors Anand Tikotekar, Geoffroy Vallée, Thomas Naughton, Stephen L. Scott, Chokchai Leangsuksun