This paper addresses the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where host...
In this paper, we present the design and implementation of a distributed sensor network application for embedded, isolated-word, real-time speech recognition. In our system design...
Chung-Ching Shen, William Plishker, Shuvra S. Bhat...
Power consumption is a troublesome design constraint for emergent systems such as IBM’s BlueGene/L. If current trends continue, future petaflop systems will require 100 megawat...
Parallel and distributed systems are representative of large and complex systems that require the application of formal methods. These systems are often unreliable because implemen...
Victoria Chernyakhovsky, Peter Frey, Radharamanan ...
We present design and implementation details, as well as performance results, for two new parallel checkpointing libraries that we developed for parallel MPI applications. The first o...