Sciweavers

453 search results - page 16 / 91
» Fault-Tolerant Techniques for Ambient Intelligent Distribute...
Sort
View
HPDC
2000
IEEE
15 years 6 months ago
Distributed Processor Allocation in Large PC Clusters
Current processor allocation techniques for highly parallel systems are based on centralized front-end based algorithms. As a result, the applied strategies are restricted to stat...
Hans-Ulrich Heiss, César A. F. De Rose, Phi...
ISPA
2004
Springer
15 years 7 months ago
Highly Reliable Linux HPC Clusters: Self-Awareness Approach
Abstract. Current solutions for fault-tolerance in HPC systems focus on dealing with the result of a failure. However, most are unable to handle runtime system configuration change...
Chokchai Leangsuksun, Tong Liu, Yudan Liu, Stephen...
SIGSOFT
2008
ACM
16 years 2 months ago
Experimenting with exception propagation mechanisms in service-oriented architecture
Exception handling is one of the popular means used for improving dependability and supporting recovery in the ServiceOriented Architecture (SOA). This practical experience paper ...
Anatoliy Gorbenko, Alexander Romanovsky, Vyachesla...
117
Voted
SPAA
2010
ACM
15 years 6 months ago
Securing every bit: authenticated broadcast in radio networks
This paper studies non-cryptographic authenticated broadcast in radio networks subject to malicious failures. We introduce two protocols that address this problem. The first, Nei...
Dan Alistarh, Seth Gilbert, Rachid Guerraoui, Zark...
ICDCS
2012
IEEE
13 years 4 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (1015 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...