Sciweavers

402 search results - page 3 / 81
» Fault-tolerance in the Borealis distributed stream processin...
PVM
2007
Springer
14 years 11 days ago
Using CMT in SCTP-Based MPI to Exploit Multiple Interfaces in Cluster Nodes
Many existing clusters use inexpensive Gigabit Ethernet and often have multiple interface cards to improve bandwidth and enhance fault tolerance. We investigate the use of Concurr...
Brad Penoff, Mike Tsai, Janardhan R. Iyengar, Alan...
PRDC
2005
IEEE
13 years 12 months ago
Sigma: A Fault-Tolerant Mutual Exclusion Algorithm in Dynamic Distributed Systems Subject to Process Crashes and Memory Losses
This paper introduces the Sigma algorithm, which solves the fault-tolerant mutual exclusion problem in dynamic systems where the set of processes may be large and change dynamically, pr...
Wei Chen, Shiding Lin, Qiao Lian, Zheng Zhang
CLUSTER
2011
IEEE
12 years 6 months ago
Dynamic Load Balance for Optimized Message Logging in Fault Tolerant HPC Applications
Computing systems will grow significantly larger in the near future to satisfy the needs of computational scientists in areas like climate modeling, biophysics and cosmology. S...
Esteban Meneses, Laxmikant V. Kalé, Greg Br...
ICS
2011
Tsinghua U.
12 years 9 months ago
High performance linpack benchmark: a fault tolerant implementation without checkpointing
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
Teresa Davies, Christer Karlsson, Hui Liu, Chong D...
DSN
2003
IEEE
13 years 11 months ago
Comparison of Failure Detectors and Group Membership: Performance Study of Two Atomic Broadcast Algorithms
Protocols that solve agreement problems are essential building blocks for fault tolerant distributed systems. While many protocols have been published, little has been done to ana...
Péter Urbán, Ilya Shnayderman, André...