The potential for faults in distributed computing systems is a significant complicating factor for application developers. While a variety of techniques exist for detecting and co...
Paul Stelling, Ian T. Foster, Carl Kesselman, Crai...
Executing parallel applications across distributed networks introduces the problem of fault tolerance. A viable solution for fault tolerance must keep overhead manageable and not c...
The global computational grids bring together distributed computation/communication resources. Beyond this, we envision the emergence of global 'service grids', which provide...
Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data from around the world, as well as employing large-scale computation. The...
Detecting network path anomalies generally requires examining large volumes of traffic data to find misbehavior. We observe that wide-area services, such as peer-to-peer systems an...
Ming Zhang, Chi Zhang, Vivek S. Pai, Larry L. Pete...