Abstract—Distributed storage systems provide large-scale reliable data storage by storing a certain degree of redundancy in a decentralized fashion on a group of storage nodes. T...
Aggressive technology scaling over the years has helped improve processor performance but has caused a reduction in processor reliability. Shrinking transistor sizes and lower sup...
The use of several distinct recovery procedures is one of the techniques that can be used to ensure high availability and fault tolerance of computer systems. This method has been...
Sergiy A. Vilkomir, David Lorge Parnas, Veena B. M...
Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point towards multi-threaded multi-core designs,...
Alex Shye, Tipp Moseley, Vijay Janapa Reddi, Josep...
In this paper, we propose SiLo, a novel energy-efficient shifted logging storage architecture for write-oriented workloads. By organizing free storage space of redundant mirrored...