Sciweavers

282 search results - page 15 / 57
» Reliability and Scheduling on Systems Subject to Failures
CODES
2011
IEEE
13 years 11 months ago
Analysis and optimization of fault-tolerant task scheduling on multiprocessor embedded systems
Reliability is a major requirement for most safety-related systems. To meet this requirement, fault-tolerant techniques such as hardware replication and software re-execution are ...
Jia Huang, Jan Olaf Blech, Andreas Raabe, Christia...
HPDC
2012
IEEE
13 years 2 months ago
Understanding the effects and implications of compute node related failures in Hadoop
Hadoop has become a critical component in today’s cloud environment. Ensuring good performance for Hadoop is paramount for the wide range of applications built on top of it. In ...
Florin Dinu, T. S. Eugene Ng
EMSOFT
2005
Springer
15 years 5 months ago
Random testing of interrupt-driven software
Interrupt-driven embedded software is hard to thoroughly test since it usually contains a very large number of executable paths. Developers can test more of these paths using rand...
John Regehr
ASIAN
2003
Springer
15 years 5 months ago
Unreliable Failure Detectors via Operational Semantics
The concept of unreliable failure detectors for reliable distributed systems was introduced by Chandra and Toueg as a fine-grained means to add weak forms of synchrony i...
Uwe Nestmann, Rachele Fuzzati
IPSN
2010
Springer
15 years 6 months ago
Diagnostic powertracing for sensor node failure analysis
Troubleshooting unresponsive sensor nodes is a significant challenge in remote sensor network deployments. This paper introduces the tele-diagnostic powertracer, an in-situ troub...
Mohammad Maifi Hasan Khan, Hieu Khac Le, Michael L...