Sciweavers

19 search results - page 2 / 4
» Distributed Synthesis of Fault-Tolerant Programs in the High...
IPPS
1998
IEEE
13 years 9 months ago
Affordable Fault Tolerance Through Adaptation
Fault-tolerant programs are typically not only difficult to implement but also incur extra costs in terms of performance or resource consumption. Failures are typically relatively ...
Ilwoo Chang, Matti A. Hiltunen, Richard D. Schlich...
CLUSTER
2003
IEEE
13 years 10 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
Large clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...
CCGRID
2006
IEEE
13 years 11 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
EURONGI
2006
Springer
13 years 9 months ago
Randomized Self-stabilizing Algorithms for Wireless Sensor Networks
Wireless sensor networks (WSNs) pose challenges not present in classical distributed systems: resource limitations, high failure rates, and ad hoc deployment. The lossy nature of w...
Volker Turau, Christoph Weyer
PADS
2006
ACM
13 years 11 months ago
Aurora: An Approach to High Throughput Parallel Simulation
A master/worker paradigm for executing large-scale parallel discrete event simulation programs over network-enabled computational resources is proposed and evaluated. In contrast t...
Alfred Park, Richard M. Fujimoto