This paper presents an instance-based approach to diagnosing failures in computing systems. Because a large portion of failures are repeats of earlier ones, our meth...
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata manage...
Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Da...
Safety-critical systems typically operate in unpredictable environments. Requirements for safety and reliability are in conflict with those for real-time responsiveness. Due to un...
Systems that use or serve multimedia data require timely access to data on hard drives. To ensure adequate performance, users must either prevent overload of disk resources or use...
We present Bristlecone, a programming language for robust software systems. Bristlecone applications have two components: a high-level organization description that specifies how t...