Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers prese...
Joseph Tucek, Shan Lu, Chengdu Huang, Spiros Xanth...
This paper demonstrates that the dependability of generic, evolving J2EE applications can be enhanced through a combination of a few recovery-oriented techniques. Our goal is to r...
George Candea, Emre Kiciman, Steve Zhang, Pedram K...
This paper proposes a novel recovery mechanism from large-scale network failures caused by earthquakes, terrorist attacks, large-scale power outages and software bugs. Our metho...
Takuro Horie, Go Hasegawa, Satoshi Kamei, Masayuki...
In today’s IT service market, customers urge providers to grant guarantees for quality of service (QoS), which are laid down in Service Level Agreements (SLAs). To satisfy custome...
HARNESS is an experimental metacomputing system that supports dynamic software reconfiguration, both of the resources that comprise the virtual machine and the services provided t...