As Internet applications become larger and more complex, the task of managing them becomes overwhelming. “Abnormal” events such as software updates, failures, attacks, and hots...
Peter Van Roy, Seif Haridi, Alexander Reinefeld, J...
In order to be economically feasible and to offer high levels of availability and performance, large-scale distributed systems depend on the automation of repair services. While t...
Using grid resources to execute scientific applications requiring a large amount of computing power is attractive but not easy from the user's point of view. Vigne is a grid operati...
Emmanuel Jeanvoine, Louis Rilling, Christine Morin...
This paper studies the problem of realizing a common software clock among a large set of nodes without an external time reference (i.e., internal clock synchronization), any centr...
We consider storage in an extremely large-scale distributed computer system designed for stream processing applications. In such systems, incoming data and intermediate results ma...
Kirsten Hildrum, Fred Douglis, Joel L. Wolf, Phili...