Sciweavers

» Distributed Fault Tolerant Controllers
ISADS
1999
IEEE
15 years 2 months ago
Fault Tolerance in Decentralized Systems
In a decentralised system the problems of fault tolerance, and in particular error recovery, vary greatly depending on the design assumptions. For example, in a distributed datab...
Brian Randell
ICDCS
1995
IEEE
15 years 1 months ago
Newtop: A Fault-Tolerant Group Communication Protocol
A general-purpose group communication protocol suite called Newtop is described. It is assumed that processes can simultaneously belong to many groups, group size could be large,...
Paul D. Ezhilchelvan, Raimundo A. Macêdo, Sa...
CCGRID
2008
IEEE
14 years 10 months ago
Fault Tolerance and Recovery of Scientific Workflows on Computational Grids
In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
Gopi Kandaswamy, Anirban Mandal, Daniel A. Reed
HCW
1998
IEEE
15 years 2 months ago
CCS Resource Management in Networked HPC Systems
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administr...
Axel Keller, Alexander Reinefeld
HPCA
1996
IEEE
15 years 2 months ago
Fault-Tolerance with Multimodule Routers
Current multiprocessors such as the Cray T3D support interprocessor communication using partitioned dimension-order routers (PDRs). In a PDR implementation, the routing logic and sw...
Suresh Chalasani, Rajendra V. Boppana