Current trends suggest future software systems will rely on service-discovery protocols to combine and recombine distributed services dynamically in reaction to changing condition...
Christopher Dabrowski, Kevin L. Mills, Andrew L. R...
Automatic identification of software faults has enormous practical significance. This requires characterizing program execution behavior. Equally important is the aspect of diagno...
Recovery systems must save state before a failure occurs to enable the system to recover from the failure. However, recovery will fail if the recovery system saves any state corru...
A significant fraction of software failures in large-scale Internet systems are cured by rebooting, even when the exact failure causes are unknown. However, rebooting can be expen...
George Candea, Shinichi Kawamoto, Yuichi Fujiki, G...
Unplanned system outages have a negative impact on company revenues and image. While recent decades have seen considerable effort from industry and academia to avoid them, they stil...
Javier Alonso, Jordi Torres, Josep Lluis Berral, R...