OSDI
2008
ACM

CuriOS: Improving Reliability through Operating System Structure

An error that occurs in a microkernel operating system service can potentially result in state corruption and service failure. A simple restart of the failed service is not always the best solution for reliability. Blindly restarting a service which maintains client-related state such as session information results in the loss of this state and affects all clients that were using the service. CuriOS represents a novel OS design that uses lightweight distribution, isolation and persistence of OS service state to mitigate the problem of state loss during a restart. The design also significantly reduces error propagation within client-related state maintained by an OS service. This is achieved by encapsulating services in separate protection domains and granting access to client-related state only when required for request processing. Fault injection experiments show that it is possible to recover from between 87% and 100% of manifested errors in OS services such as the file system, netw...
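The central idea in the abstract, keeping client-related state outside the service and handing it over only while a request is being processed, can be illustrated with a small sketch. This is not the CuriOS implementation: the class names `ClientStateStore`, `CounterService`, and `Dispatcher` are hypothetical, and plain Python objects stand in for separate protection domains, which Python does not enforce.

```python
class ClientStateStore:
    """Hypothetical stand-in for per-client state kept outside the service."""

    def __init__(self):
        self._state = {}

    def grant(self, client_id):
        # Hand out only the state that belongs to the requesting client.
        return self._state.setdefault(client_id, {})


class CounterService:
    """Toy service that keeps no client state of its own between requests."""

    def handle(self, state, payload):
        # The service sees one client's state, and only during this call.
        state["count"] = state.get("count", 0) + 1
        return state["count"]


class Dispatcher:
    """Routes requests and can restart the service without touching the store."""

    def __init__(self, store):
        self.store = store
        self.service = CounterService()

    def request(self, client_id, payload=None):
        state = self.store.grant(client_id)
        return self.service.handle(state, payload)

    def restart_service(self):
        # A restart replaces the service object; client state lives in the
        # store, so it survives the restart.
        self.service = CounterService()


dispatcher = Dispatcher(ClientStateStore())
dispatcher.request("client-a")
dispatcher.request("client-a")
dispatcher.restart_service()
after_restart = dispatcher.request("client-a")  # counter continues from stored state
other_client = dispatcher.request("client-b")   # unaffected by client-a's state
```

Because the service holds no client state itself, restarting it loses nothing, and a fault while handling one client's request can only reach the state granted for that request. In this toy version that separation is a convention only; a real system would need hardware-enforced protection domains.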
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2008
Where OSDI
Authors Francis M. David, Ellick Chan, Jeffrey C. Carlyle, Roy H. Campbell