Failure Resilience for Device Drivers

9 years 24 days ago
Failure Resilience for Device Drivers
Studies have shown that device drivers and extensions contain 3–7 times more bugs than other operating system code and thus are more likely to fail. Therefore, we present a failure-resilient operating system design that can recover from dead drivers and other critical components—primarily through monitoring and replacing malfunctioning components on the fly—transparent to applications and without user intervention. This paper focuses on the post-mortem recovery procedure. We explain the working of our defect detection mechanism, the policy-driven recovery procedure, and post-restart reintegration of the components. Furthermore, we discuss the concrete steps taken to recover from network, block device, and character device driver failures. Finally, we evaluate our design using performance measurements, software fault-injection experiments, and an analysis of the reengineering effort.
Jorrit N. Herder, Herbert Bos, Ben Gras, Philip Ho
Added 02 Jun 2010
Updated 02 Jun 2010
Type Conference
Year 2007
Where DSN
Authors Jorrit N. Herder, Herbert Bos, Ben Gras, Philip Homburg, Andrew S. Tanenbaum
Comments (0)