Abstract. Distributed applications running on clusters may be composed of several components with very different performance requirements. The FlowVR middleware allows the develop...
As the scale of high-performance computing (HPC) continues to grow, failure resilience of parallel applications becomes crucial. In this paper, we present FT-Pro, an adaptive fault...
Devising novel thread-level speculation (TLS) techniques has become extremely important with the growing acceptance of multi-core architectures by the industry. H...
Parallel computing is becoming increasingly central and mainstream, driven both by the widespread availability of commodity SMP and high-performance cluster platforms, as well as th...
For a grid middleware to perform resource allocation, it needs prediction models that can determine how long an application will take to complete on a particular platform o...