Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point towards multi-threaded multi-core designs,...
Alex Shye, Tipp Moseley, Vijay Janapa Reddi, Josep...
This paper presents a novel system architecture applicable to high-performance and flexible transport data processing which includes complex protocol operation and a network contr...
We present a replication-based approach to fault-tolerant distributed stream processing in the face of node failures, network failures, and network partitions. Our approach aims t...
Magdalena Balazinska, Hari Balakrishnan, Samuel Ma...
With the heavy reliance of modern scientific applications upon the MPI Standard, it has become critical for the implementation of MPI to be as capable and as fast as possible. Th...
Keith D. Underwood, K. Scott Hemmert, Arun Rodrigu...
This paper reports work to support dependability arguments about the future reliability of a product before there is direct empirical evidence. We develop a method for estimating ...