Sciweavers

282 search results - page 12 / 57
» Reliability and Scheduling on Systems Subject to Failures
IPPS
2005
IEEE
15 years 3 months ago
Fault-Tolerant Parallel Applications with Dynamic Parallel Schedules
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
Sebastian Gerlach, Roger D. Hersch
DSN
2005
IEEE
15 years 3 months ago
Probabilistic QoS Guarantees for Supercomputing Systems
Supercomputing systems must be able to reliably and efficiently complete their assigned workloads, even in the presence of failures. This paper proposes a system that allows the ...
Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo, ...
TECS
2010
14 years 8 months ago
Recovering from distributable thread failures in distributed real-time Java
We consider the problem of recovering from failures of distributable threads (“threads”) in distributed real-time systems that operate under run-time uncertainties including th...
Edward Curley, Binoy Ravindran, Jonathan Stephen A...
CODES
2007
IEEE
15 years 4 months ago
Improved response time analysis of tasks scheduled under preemptive Round-Robin
Round-Robin scheduling is the most popular time-triggered scheduling policy, and has been widely used in communication networks over the last decades. It is an efficient schedulin...
Razvan Racu, Li Li, Rafik Henia, Arne Hamann, Rolf...
SOSP
2005
ACM
15 years 6 months ago
IRON file systems
Commodity file systems trust disks to either work or fail completely, yet modern disks exhibit more complex failure modes. We suggest a new fail-partial failure model for disks, ...
Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, N...