A large number of user requests execute (often concurrently) within a server system. A single request may exhibit fluctuating hardware characteristics (such as instruction comple...
The threat of soft error induced system failure in high performance computing systems has become more prominent, as we adopt ultra-deep submicron process technologies. In this pap...
—Clusters and applications continue to grow in size while their mean time between failure (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...
Database Management Systems (DBMSs) that can be tailored to specific requirements offer the potential to improve reliability and maintainability and simultaneously the ability t...
Florian Irmert, Frank Lauterwald, Christoph P. Neu...
Hardware support for dynamic analysis can minimize the performance overhead of useful applications such as security checks, debugging, and profiling. To eliminate implementation ...