Dynamic fault-tolerance management (DFTM) was previously introduced as a means of providing environmentand workload-driven adaptation for failure-prone battery powered systems. Th...
Autonomous robots offer alluring perspectives in numerous application domains: space rovers, satellites, medical assistants, tour guides, etc. However, a severe lack of trust in t...
As the scale of cluster computing grows, it is becoming hard for long-running applications to complete without facing failures on large-scale clusters. To address this issue, chec...
Emerging VLSI technologies and platforms are giving rise to systems with inherently high potential for runtime failure. Such failures range from intermittent electrical and mechan...
This paper presents the design and implementation of the Distributed Autonomous Replication Management (DARM) framework built on top of the Spread group communication system. The ...