Processor cores embedded in systems-on-a-chip (SoCs) are often deployed in critical computations, and when affected by faults they may produce dramatic effects. When hardware harde...
This paper discusses measures to make a distributed system based on the Time-Triggered Architecture resistant to arbitrary node failures. To achieve this, the presented approach i...
In a network consisting of several thousands computers, the occurrence of faults is unavoidable. Being able to test the behaviour of a distributed program in an environment where ...
With the increasing number of processors in modern HPC(High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
The limitations of BGP routing in the Internet are often blamed for poor end-to-end performance and prolonged connectivity interruptions. Recent work advocates using overlays to e...
Aditya Akella, Jeffrey Pang, Bruce M. Maggs, Srini...