Programmers of large-scale trusted systems need tools to simplify tasks such as replicating services or data. Group communication systems achieve this via various flavors of relia...
Failure detectors (or, more accurately, Failure Suspectors, FS) appear to be a fundamental service upon which to build fault-tolerant, distributed applications. This paper shows t...
This paper introduces a self-configuring architecture for scaling the database tier of dynamic content web servers. We use a unified approach to load and fault management based ...
Gokul Soundararajan, Kaloian Manassiev, Jin Chen, ...
Grid software developers and Grid site administrators both require realistic testbeds where they can test applications and middleware before deployment on production infrastructur...
Stephen Childs, Brian A. Coghlan, Jason McCandless
Significant achievements have been made for automated allocation of cloud resources. However, the performance of applications may be poor in peak load periods, unless t...
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aber...