When failures occur in Internet overlay connections today, it is difficult for users to determine the root cause of failure. An overlay connection may require TCP connections bet...
An important step in achieving robustness to run-time faults is the ability to detect and repair problems when they arise in a running system. Effective fault detection a...
Paulo Casanova, Bradley R. Schmerl, David Garlan, ...
There is an increasing demand for highly reliable systems in the safety-conscious climate of today’s world. When a fault does occur, there are two desirable outcomes. Firstly, de...
Continued technology scaling is resulting in systems with billions of devices. Unfortunately, these devices are prone to failures from various sources, resulting in even commodity...
We develop a microprocessor design that tolerates hard faults, including fabrication defects and in-field faults, by leveraging existing microprocessor redundancy. To do this, we...