Sciweavers

139 search results - page 1 / 28
ICDCS
2003
IEEE
13 years 9 months ago
Software Fault Tolerance of Distributed Programs Using Computation Slicing
Writing correct distributed programs is hard. In spite of extensive testing and debugging, software faults persist even in commercial grade software. Many distributed systems, esp...
Neeraj Mittal, Vijay K. Garg
ISORC
2009
IEEE
13 years 11 months ago
Fault-Tolerance for Component-Based Systems - An Automated Middleware Specialization Approach
General-purpose middleware, by definition, cannot readily support domain-specific semantics without significant manual efforts in specializing the middleware. This paper prese...
Sumant Tambe, Akshay Dabholkar, Aniruddha S. Gokha...
CLUSTER
2006
IEEE
13 years 10 months ago
FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults since their occurrence in Grid infrastructures and in large-sca...
William Hoarau, Pierre Lemarinier, Thomas Hé...
IPPS
1999
IEEE
13 years 8 months ago
An Adaptive, Fault-Tolerant Implementation of BSP for JAVA-Based Volunteer Computing Systems
In recent years, there has been a surge of interest in Java-based volunteer computing systems, which aim to make it possible to build very large parallel computing network...
Luis F. G. Sarmenta
SIGSOFT
2007
ACM
14 years 5 months ago
Efficient checkpointing of Java software using context-sensitive capture and replay
Checkpointing and replaying is an attractive technique that has been used widely at the operating/runtime system level to provide fault tolerance. Applying such a technique at the...
Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin