Sciweavers

IJHPCA
2006
MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI
High performance computing platforms like clusters, grids and desktop grids are becoming larger and subject to more frequent failures. MPI is one of the most used message...
Aurelien Bouteiller, Thomas Hérault, Gé...

Publication
An adaptive QoS-aware fault tolerance strategy for web services
Service-Oriented Architecture (SOA) is widely adopted for building mission-critical systems, ranging from on-line stores to complex airline management systems. How to build reliabl...
CORR
2008
Springer
Algorithmic Based Fault Tolerance Applied to High Performance Computing
We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance techniq...
George Bosilca, Remi Delmas, Jack Dongarra, Julien...
CORR
2010
Springer
Unidirectional Error Correcting Codes for Memory Systems: A Comparative Study
In order to achieve fault tolerance, highly reliable systems often require the ability to detect errors as soon as they occur and prevent the spread of erroneous information throu...
Muzhir Al-Ani, Qeethara Al-Shayea
ECRTS
2008
IEEE
ORTEGA: An Efficient and Flexible Software Fault Tolerance Architecture for Real-Time Control Systems
Fault tolerance is an important aspect in real-time computing. In real-time control systems, tasks could be faulty due to various reasons. Faulty tasks may compromise the performa...
Xue Liu, Hui Ding, Kihwal Lee, Qixin Wang, Lui Sha
DSD
2010
IEEE
RobuCheck: A Robustness Checker for Digital Circuits
Continuously shrinking feature sizes cause an increasing vulnerability of digital circuits. Manufacturing failures and transient faults may tamper with the functionality. Aut...
Stefan Frehse, Görschwin Fey, André S...
USENIX
1996
Transparent Fault Tolerance for Parallel Applications on Networks of Workstations
This paper describes a new method for providing transparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...
Daniel J. Scales, Monica S. Lam
PDPTA
2000
Evaluation of Integrated Error Processing and Fault Diagnosis in Multiprocessor Systems
This paper deals with multiprocessor systems required to provide both high performance and good figures of dependability attributes. Fault tolerance is pursued through a proper co...
Felicita Di Giandomenico, Silvano Chiaradonna, And...
WISES
2003
Built-In Fault Injectors - The Logical Continuation of BIST?
With the increasing number of embedded computer systems being used in safety-critical applications, the testing and assessment of a system’s fault tolerance properties become ...
Andreas Steininger, Babak Rahbaran, Thomas Handl
ESANN
2001
Learning fault-tolerance in Radial Basis Function Networks
This paper describes a method of supervised learning based on forward selection branching. This method improves fault tolerance by means of combining information related to general...
Xavier Parra, Andreu Català