When monitored, instrumented long-running parallel applications generate huge amounts of instrumentation data. Processing and storing this data incurs overhead and perturbs the ...
We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of processes connect...
Olga Shumsky Matlin, Ewing L. Lusk, William McCune
In a Linux cluster, as in any multi-processor system, the inter-processor communication rate is the major limiting factor to its general usefulness. This research is geared toward...
Parallel disk I/O subsystems are becoming more important in today’s large-scale parallel machines. Parallel disk systems provide a significant boost in I/O performance reducing ...
We describe a software architecture for storage services in computational grid environments. Based upon a lightweight message-passing paradigm, the architecture enables the provis...