Sciweavers

SOSP
2001
ACM
15 years 10 months ago
BASE: Using Abstraction to Improve Fault Tolerance
Miguel Castro (Microsoft Research); Rodrigo Rodrigues and Barbara Liskov (MIT Laboratory for Computer Science). Software errors are a major...
Rodrigo Rodrigues, Miguel Castro, Barbara Liskov
IPPS
2007
IEEE
15 years 8 months ago
The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI
To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
IPPS
2003
IEEE
15 years 7 months ago
A Low Cost Fault Tolerant Packet Routing for Parallel Computers
This work presents a new switching mechanism to tolerate arbitrary faults in interconnection networks with a negligible implementation cost. Although our routing technique can be ...
Valentin Puente, José A. Gregorio, Ramó...
ISORC
2007
IEEE
15 years 8 months ago
Exploiting Tuple Spaces to Provide Fault-Tolerant Scheduling on Computational Grids
Scheduling tasks on large-scale computational grids is difficult due to the heterogeneous computational capabilities of the resources, node unavailability and unreliable network ...
Fábio Favarim, Joni da Silva Fraga, Lau Che...
CCGRID
2008
IEEE
15 years 8 months ago
A Technique for Lock-Less Mirroring in Parallel File Systems
As parallel file systems span larger and larger numbers of nodes in order to provide the performance and scalability necessary for modern cluster applications, the need for fau...
Bradley W. Settlemyer, Walter B. Ligon III