Sciweavers

175 search results - page 3 / 35
» Scalable Fault-Tolerant Distributed Shared Memory
DNA
2008
Springer
13 years 8 months ago
Connecting the Dots: Molecular Machinery for Distributed Robotics
Abstract. Nature is considered one promising area to search for inspiration in designing robotic systems. Some work in swarm robotics has tried to build systems that resemble distr...
Yuriy Brun, Dustin Reishus
CLUSTER
2004
IEEE
13 years 10 months ago
FTC-Charm++: an in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI
As high performance clusters continue to grow in size, the mean time between failures shrinks. Thus, the issues of fault tolerance and reliability are becoming one of the challengi...
Gengbin Zheng, Lixia Shi, Laxmikant V. Kalé
USENIX
1996
13 years 7 months ago
Transparent Fault Tolerance for Parallel Applications on Networks of Workstations
This paper describes a new method for providing transparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...
Daniel J. Scales, Monica S. Lam
ICS
2011
Tsinghua U.
12 years 9 months ago
High performance linpack benchmark: a fault tolerant implementation without checkpointing
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
Teresa Davies, Christer Karlsson, Hui Liu, Chong D...
IPPS
2006
IEEE
14 years 13 days ago
Algorithm-based checkpoint-free fault tolerance for parallel matrix computations on volatile resources
As the desire of scientists to perform ever larger computations drives the size of today’s high performance computers from hundreds, to thousands, and even tens of thousands of ...
Zizhong Chen, Jack Dongarra