A major hurdle facing data-intensive grid applications is the appropriate handling of failures that occur in the grid environment. Implementing fault tolerance transparently a...
Power and energy are first-order design constraints in high-performance computing. Current research using dynamic voltage scaling (DVS) relies on trading increased execution time...
Barry Rountree, David K. Lowenthal, Bronis R. de S...
While measures such as raw compute performance and system capacity continue to be important factors for evaluating cluster performance, such issues as system reliability...
William M. Jones, John T. Daly, Nathan DeBardelebe...
As Internet applications become larger and more complex, the task of managing them becomes overwhelming. “Abnormal” events such as software updates, failures, attacks, and hots...
Peter Van Roy, Seif Haridi, Alexander Reinefeld, J...
Distributed systems are difficult to design and develop. The difficulties arise both in basic safety correctness properties and in achieving high performance. As a result of this...