CX, a network-based computational exchange, is presented. The system’s design integrates variations of ideas from other researchers, such as work stealing, non-blocking tasks, e...
Device scaling trends dramatically increase the susceptibility of microprocessors to soft errors. Further, mounting demand for embedded microprocessors in a wide array of safety c...
Jason A. Blome, Shantanu Gupta, Shuguang Feng, Sco...
Software bugs in routers lead to network outages, security vulnerabilities, and other unexpected behavior. Rather than simply crashing the router, bugs can violate protocol semant...
Eric Keller, Minlan Yu, Matthew Caesar, Jennifer R...
Providing reliable fault-tolerant sensors is a challenge for distributed systems. The demonstration setup combines three sensors and allows to inject different faults that are rel...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...