Throughput maximization in a packet-switched wireless communication system is considered in this paper. The channel variation is accounted for by modeling the channel as a fin...
OpenMP can be supported in cluster environments by using distributed shared memory (DSM) systems. A portable approach for building DSM systems is to layer them on MPI. With these...
Given the scale of massively parallel systems, the occurrence of faults is no longer an exception but a regular event. Periodic checkpointing is becoming increasingly important in the...
We present a fault-tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
A major contributing factor to the complexity of creating and evolving distributed systems is the tangling of middleware-specific functionality with core business functionality in...