Sciweavers

149 search results - page 2 / 30
» The Performance of Coordinated and Independent Checkpointing
Sort
View
SRDS
1994
IEEE
13 years 9 months ago
Coordinated Checkpointing-Rollback Error Recovery for Distributed Shared Memory Multicomputers
Most recovery schemes that have been proposed for Distributed Shared Memory (DSM) systems require unnecessarily high checkpointing frequency and checkpoint traffic, which are sens...
G. Janakiraman, Yuval Tamir
WAIM
2004
Springer
13 years 10 months ago
A Low-Cost Checkpointing Scheme for Mobile Computing Systems
In distributed computing systems, processes in different hosts take checkpoints to survive failures. For mobile computing systems, due to certain new characteristics conventional d...
Guohui Li, Hongya Wang, Jixiong Chen
ICDCS
2000
IEEE
13 years 9 months ago
Coherence-based Coordinated Checkpointing for Software Distributed Shared Memory Systems
Fault-tolerant techniques that can cope with system failures in software distributed shared memory (SDSM) are essential for creating productive and highly available parallel compu...
Angkul Kongmunvattana, Santipong Tanchatchawal, Ni...
CLUSTER
2004
IEEE
13 years 8 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
IPPS
2007
IEEE
13 years 11 months ago
Improving MPI Independent Write Performance Using A Two-Stage Write-Behind Buffering Method
Many large-scale production applications often have very long executions times and require periodic data checkpoints in order to save the state of the computation for program rest...
Wei-keng Liao, Avery Ching, Kenin Coloma, Alok N. ...