The productivity of HPC system is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...
The persistent programming systems of the 1980s offered a programming model that integrated computation and long-term storage. In these systems, reliable applications could be eng...
Alan Dearle, Graham N. C. Kirby, Stuart J. Norcros...
On-chip implementation of multiprocessor systems requires the planarization of the interconnect network onto the silicon floorplan. Manual floorplanning approaches will become i...
Security and reliability issues in distributed systems have been investigated for several years at LAAS using a technique called Fragmentation-Redundancy-Scattering (FRS). The aim ...
Abstract--The Grid is an heterogeneous and dynamic environment which enables distributed computation. This makes it a technology prone to failures. Some related work uses replicati...