Sciweavers

149 search results - page 3 / 30
» The Performance of Coordinated and Independent Checkpointing
Sort
View
DAIS
2006
13 years 6 months ago
Using Speculative Push for Unnecessary Checkpoint Creation Avoidance
Abstract. This paper discusses a way of incorporating speculation techniques into Distributed Shared Memory (DSM) systems with checkpointing mechanism without creating unnecessary ...
Arkadiusz Danilecki, Michal Szychowiak
FGCS
2008
140views more  FGCS 2008»
13 years 5 months ago
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant progr...
Darius Buntinas, Camille Coti, Thomas Hérau...
ISCA
2011
IEEE
238views Hardware» more  ISCA 2011»
12 years 9 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas
PPAM
2005
Springer
13 years 10 months ago
Checkpointing Speculative Distributed Shared Memory
This paper describes a checkpointing mechanism destined for Distributed Shared Memory (DSM) systems with speculative prefetching. Speculation is a general technique involving predi...
Arkadiusz Danilecki, Anna Kobusinska, Michal Szych...
ISCA
2002
IEEE
115views Hardware» more  ISCA 2002»
13 years 10 months ago
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...