-- A hardware fault tolerance scheme for large multicomputers executing time-consuming non-interactive applications is described. Error detection and recovery are done mostly by so...
Abstract—System availability is difficult for systems to maintain in the face of Internet worms. Large systems have vulnerabilities, and if a system attempts to continue operati...
Daniela A. S. de Oliveira, Jedidiah R. Crandall, G...
Data corruption is one of the key problems that is on top of the radar screen of most CIOs. Continuous Data Protection (CDP) technologies help enterprises deal with data corruptio...
Abstract--Malicious and misconfigured nodes can inject incorrect state into a distributed system, which can then be propagated system-wide as a result of normal network operation. ...
Daniel Gyllstrom, Sudarshan Vasudevan, Jim Kurose,...
As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whe...
Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaral...