Sciweavers

200 search results - page 16 / 40
» Design Time Reliability Analysis of Distributed Fault Tolera...
SPAA
2003
ACM
15 years 4 months ago
The complexity of verifying memory coherence
The general problem of verifying coherence for shared-memory multiprocessor executions is NP-Complete. Verifying memory consistency models is therefore NP-Hard, because memory con...
Jason F. Cantin, Mikko H. Lipasti, James E. Smith
ISCA
2011
IEEE
14 years 3 months ago
Sampling + DMR: practical and low-overhead permanent fault detection
With technology scaling, manufacture-time and in-field permanent faults are becoming a fundamental problem. Multi-core architectures with spares can tolerate them by detecting an...
Shuou Nomura, Matthew D. Sinclair, Chen-Han Ho, Ve...
IEEEHPCS
2010
14 years 9 months ago
Using replication and checkpointing for reliable task management in computational Grids
In grid computing systems, providing fault-tolerance is required for both scientific computation and file-sharing to increase their reliability. In previous works, several mechani...
Sangho Yi, Derrick Kondo, Bongjae Kim, Geunyoung P...
IPPS
2010
IEEE
14 years 9 months ago
Optimizing RAID for long term data archives
We present new methods to extend data reliability of disks in RAID systems for applications like long term data archival. The proposed solutions extend existing algorithms to detec...
Henning Klein, Jörg Keller
DAC
2008
ACM
16 years 19 days ago
Study of the effects of MBUs on the reliability of a 150 nm SRAM device
Soft errors induced by radiation are an increasing problem in the microelectronic field. Although traditional models estimate the reliability of memories suffering Single Event U...
Juan Antonio Maestro, Pedro Reviriego