— Today’s system monitoring tools are capable of detecting system failures such as host failures, OS errors, and network partitions in near-real time. Unfortunately, the same c...
Dan Gunter, Brian Tierney, Aaron Brown, D. Martin ...
In high energy physics, bioinformatics, and other disciplines, we encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. So...
Abstract. We consider the problem of estimating vector-valued variables from noisy "relative" measurements. The measurement model can be expressed in terms of a graph, wh...
Prabir Barooah, Neimar Machado da Silva, Joã...
Abstract. In this paper, we address job scheduling in Distributed Computing Infrastructures, that is a loosely coupled network of autonomous acting High Performance Computing syste...
Large-scale distributed systems provide an attractive scalable infrastructure for network applications. However, the loosely-coupled nature of this environment can make data acces...