Sciweavers

242 search results - page 14 / 49
» Computing bounds for fault tolerance using formal techniques
ICS
2007
Tsinghua U.
15 years 3 months ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
USENIX
2008
14 years 12 months ago
Improving Scalability and Fault Tolerance in an Application Management Infrastructure
This paper explores the challenges associated with distributed application management in large-scale computing environments. In particular, we investigate several techniques for e...
Nikolay Topilski, Jeannie R. Albrecht, Amin Vahdat
CCGRID
2001
IEEE
15 years 1 months ago
Sabotage-Tolerance Mechanisms for Volunteer Computing Systems
In this paper, we address the new problem of protecting volunteer computing systems from malicious volunteers who submit erroneous results by presenting sabotage-tolerance mechanis...
Luis F. G. Sarmenta
ICDCS
1990
IEEE
15 years 1 months ago
Implementing Fault-Tolerant Distributed Applications
This paper develops some control structures suitable for composing fault-tolerant distributed applications using atomic actions (atomic transactions) as building blocks, and then...
Santosh K. Shrivastava, Stuart M. Wheater
ICPP
2000
IEEE
15 years 1 months ago
A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems
The idle computers on a local area, campus area, or even wide area network represent a significant computational resource--one that is, however, also unreliable, heterogeneous, an...
Adriana Iamnitchi, Ian T. Foster