LCPC
2005
Springer

Lightweight Monitoring of the Progress of Remotely Executing Computations

Abstract. The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of those resources, and to monitor jobs executing on remote systems. This paper presents a novel and lightweight approach to monitoring the progress and correctness of a parallel computation on a remote, and potentially fraudulent, host system. We describe a monitoring system that uses a sequence of program counter values to monitor program progress, and compiler techniques that automatically generate the monitoring code. This approach improves on earlier work by eliminating the need to duplicate computation, which both simplifies monitoring and reduces its overhead. Our approach allows dynamic and accountable cycle sharing across the Internet. Experimental results show that the overhead of our system is negligible and that our monitoring approach is scalable.
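The idea of checking progress from a sequence of program counter values can be illustrated with a minimal sketch. It is not the paper's implementation: the checkpoint labels, the hand-written `TRANSITIONS` table, and the `check_progress` function are all hypothetical, and the abstract does not describe the format of the compiler-generated monitoring code. The sketch only shows how a monitor might reject a reported sequence that could not have come from a real execution, without re-running the computation.

```python
from typing import Dict, Iterable, Set

# Hypothetical transition table: for each checkpoint label (a stand-in for a
# program counter value), the labels that may legally follow it. Written by
# hand here; this format is an assumption, not taken from the paper.
TRANSITIONS: Dict[str, Set[str]] = {
    "start": {"loop_head"},
    "loop_head": {"loop_body", "done"},
    "loop_body": {"loop_head"},
    "done": set(),
}


def check_progress(
    reports: Iterable[str],
    transitions: Dict[str, Set[str]],
    start: str = "start",
) -> bool:
    """Return True if the reported checkpoint sequence is a legal path.

    The sequence must begin at `start`, and every later label must be an
    allowed successor of the label reported just before it.
    """
    current = None
    for label in reports:
        if current is None:
            # The first report must be the program's entry checkpoint.
            if label != start:
                return False
        elif label not in transitions.get(current, set()):
            # The host claims a step the program could not have taken.
            return False
        current = label
    # An empty report stream shows no progress at all.
    return current is not None
```

For example, `check_progress(["start", "loop_head", "loop_body", "loop_head", "done"], TRANSITIONS)` accepts the sequence, while `["start", "loop_body"]` is rejected because `loop_body` cannot directly follow `start` in this assumed table.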
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where LCPC
Authors Shuo Yang, Ali Raza Butt, Y. Charlie Hu, Samuel P. Midkiff