Sciweavers

» Reliable fault-tolerant sensors for distributed systems
HPDC
2008
IEEE
15 years 8 months ago
DataLab: transactional data-parallel computing on an active storage cloud
Active storage clouds are an attractive platform for executing large data-intensive workloads found in many fields of science. However, active storage presents new system managem...
Brandon Rich, Douglas Thain
ICPP
2007
IEEE
15 years 8 months ago
Fault-Driven Re-Scheduling For Improving System-level Fault Resilience
The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...
Yawei Li, Prashasta Gujrati, Zhiling Lan, Xian-He ...
ICDCS
2000
IEEE
15 years 6 months ago
On Low-Cost Error Containment and Recovery Methods for Guarded Software Upgrading
To assure dependable onboard evolution, we have developed a methodology called guarded software upgrading (GSU). In this paper, we focus on a low-cost approach to error containmen...
Ann T. Tai, Kam S. Tso, Leon Alkalai, Savio N. Cha...
EMSOFT
2009
Springer
15 years 8 months ago
Adding aggressive error correction to a high-performance compressing flash file system
While NAND flash memories have rapidly increased in both capacity and performance and are increasingly used as a storage device in many embedded systems, their reliability has de...
Yangwook Kang, Ethan L. Miller
CODES
2011
IEEE
14 years 1 month ago
Analysis and optimization of fault-tolerant task scheduling on multiprocessor embedded systems
Reliability is a major requirement for most safety-related systems. To meet this requirement, fault-tolerant techniques such as hardware replication and software re-execution are ...
Jia Huang, Jan Olaf Blech, Andreas Raabe, Christia...