Sciweavers

170 search results - page 21 / 34
» Interactive Animation of Fault Tolerant Parallel Algorithms
ICS
2004
Tsinghua U.
15 years 5 months ago
Adaptive incremental checkpointing for massively parallel systems
Given the scale of massively parallel systems, occurrence of faults is no longer an exception but a regular event. Periodic checkpointing is becoming increasingly important in the...
Saurabh Agarwal, Rahul Garg, Meeta Sharma Gupta, J...
HPDC
2009
IEEE
15 years 6 months ago
Interconnect agnostic checkpoint/restart in Open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine
LCN
2008
IEEE
15 years 6 months ago
Constructing low-latency overlay networks: Tree vs. mesh algorithms
Distributed interactive applications may have stringent latency requirements and dynamic user groups. These applications may benefit from a group communication system, ...
Knut-Helge Vik, Carsten Griwodz, Pål Halvors...
ICPADS
2006
IEEE
15 years 5 months ago
Fast Convergence in Self-Stabilizing Wireless Networks
The advent of large scale multi-hop wireless networks highlights problems of fault tolerance and scale in distributed systems, motivating designs that autonomously recover from tra...
Nathalie Mitton, Eric Fleury, Isabelle Guér...
SPAA
2003
ACM
15 years 5 months ago
The complexity of verifying memory coherence
The general problem of verifying coherence for shared-memory multiprocessor executions is NP-Complete. Verifying memory consistency models is therefore NP-Hard, because memory con...
Jason F. Cantin, Mikko H. Lipasti, James E. Smith