An important problem in distributed systems is to detect termination of a distributed computation. A computation is said to have terminated when all processes have become passive a...
Distributed systems require strategies to detect and recover from failures. Many protocols for distributed systems employ a strategy based on leases, which grant a leaseholder acc...
In this paper we describe our experience with Teapot [7], a domain-specific language for writing cache coherence protocols. Cache coherence is of concern when parallel and distrib...
Satish Chandra, James R. Larus, Michael Dahlin, Br...
Knowing the program timing characteristics is fundamental to the successful design and execution of real-time systems. A critical timing measure is the worst-case execution time (...
Multi-cluster schedulers can dramatically improve average job turnaround time by making use of fragmented node resources available throughout the grid. By carefully m...