Using a directed acyclic graph (dag) model of algorithms, we solve a problem related to precedenceconstrained multiprocessor schedules for array computations: Given a sequence of ...
: We present a new approach to fault tolerance for High Performance Computing system. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance techniq...
George Bosilca, Remi Delmas, Jack Dongarra, Julien...
This paper examines the job exchange between parallel compute sites in a decentralized Grid scenario. Here, the local scheduling system remains untouched and continues normal oper...
Christian Grimme, Joachim Lepping, Alexander Papas...
Domain-partition (DP) model is a general model for reliability maximization problem under given redundancy. In this paper, an improved DP model is used to formulate a reconfigurati...
When time-triggered (TT) systems are to be deployed for large embedded real-time (RT) control systems in cars and airplanes, one way to overcome bandwidth limitations and achieve ...