Sciweavers

Search results for "The Performance of Coordinated and Independent Checkpointing" (149 results, page 3 of 30)
DAIS
2006
14 years 11 months ago
Using Speculative Push for Unnecessary Checkpoint Creation Avoidance
This paper discusses a way of incorporating speculation techniques into Distributed Shared Memory (DSM) systems with a checkpointing mechanism without creating unnecessary ...
Arkadiusz Danilecki, Michal Szychowiak
FGCS
2008
14 years 9 months ago
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Darius Buntinas, Camille Coti, Thomas Hérau...
ISCA
2011
IEEE
14 years 1 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas
PPAM
2005
Springer
15 years 3 months ago
Checkpointing Speculative Distributed Shared Memory
This paper describes a checkpointing mechanism designed for Distributed Shared Memory (DSM) systems with speculative prefetching. Speculation is a general technique involving predi...
Arkadiusz Danilecki, Anna Kobusinska, Michal Szych...
ISCA
2002
IEEE
15 years 2 months ago
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...