Sciweavers

116 search results - page 2 / 24
» A Communication Framework for Fault-Tolerant Parallel Execut...
IPPS
2007
IEEE
13 years 11 months ago
A Framework for Experimental Validation and Performance Evaluation in Fault Tolerant Distributed Systems
Performing experimental evaluation of fault tolerant distributed systems is a complex and tedious task, and automating as much as possible of the execution and evaluation of exper...
Hein Meling
HPDC
1999
IEEE
13 years 9 months ago
Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations
This paper reports on the architecture and design of Starfish, an environment for executing dynamic (and static) MPI-2 programs on a cluster of workstations. Starfish is unique in ...
Adnan Agbaria, Roy Friedman
ICDCS
1995
IEEE
13 years 9 months ago
Parallel Processing on Networks of Workstations: A Fault-Tolerant, High Performance Approach
One of the most sought-after software innovations of this decade is the construction of systems using off-the-shelf workstations that actually deliver, and even surpass, the power and ...
Partha Dasgupta, Zvi M. Kedem, Michael O. Rabin
HASE
1997
IEEE
13 years 9 months ago
High-Coverage Fault Tolerance in Real-Time Systems Based on Point-to-Point Communication
The distributed recovery block (DRB) scheme is a widely applicable approach for realizing both hardware and software fault tolerance in real-time distributed and parallel compute...
K. H. Kim, Chittur Subbaraman, Eltefaat Shokri
GRID
2006
Springer
13 years 5 months ago
Implementation of Fault-Tolerant GridRPC Applications
In this paper, a task parallel application is implemented with Ninf-G, a GridRPC system, and evaluated experimentally for three months on the Grid testbed in Asia Pacific. The...
Yusuke Tanimura, Tsutomu Ikegami, Hidemoto Nakada,...