Abstract. Current solutions for fault tolerance in HPC systems focus on dealing with the result of a failure. However, most are unable to handle runtime system configuration change...
We propose a novel alternative to application-level overlays called VIOLIN, or Virtual Internetworking on OverLay INfrastructure. Inspired by recent advances in virtual machines, ...
Job scheduling typically focuses on the CPU, and little existing work accounts for I/O or memory. Time-shared execution provides the chance to hide I/O and long-communication latenc...
We propose an analysis for detecting procedures and goals that are deterministic (i.e., that produce at most one solution), or predicates whose clause tests are mutually exclusive (...
We study the problem of optimal preemptive scheduling with respect to a general target function. Given n jobs with associated weights and m ≤ n uniformly related machines, one a...