Sciweavers

282 search results - page 34 / 57
Reliability and Scheduling on Systems Subject to Failures
HCW
1999
IEEE
15 years 2 months ago
Metacomputing with MILAN
The MILAN project, a joint effort involving Arizona State University and New York University, has produced and validated fundamental techniques for the realization of efficient, r...
Arash Baratloo, Partha Dasgupta, Vijay Karamcheti,...
INFOCOM
1992
IEEE
15 years 2 months ago
A Study on the Inaccessibility Characteristics of ISO 8802/4 Token-Bus LANs
Local area networks have long been established as the basis for distributed systems. Continuity of service and bounded and known message delivery latency are requirements of a num...
José Rufino, Paulo Veríssimo
SIGCOMM
2006
ACM
15 years 3 months ago
Minimizing churn in distributed systems
A pervasive requirement of distributed systems is to deal with churn — change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate ca...
Brighten Godfrey, Scott Shenker, Ion Stoica
SIGMETRICS
2009
ACM
15 years 4 months ago
DRAM errors in the wild: a large-scale field study
Errors in dynamic random access memory (DRAM) are a common form of hardware failure in modern compute clusters. Failures are costly both in terms of hardware replacement costs and...
Bianca Schroeder, Eduardo Pinheiro, Wolf-Dietrich ...
ATAL
2007
Springer
15 years 4 months ago
Diagnosis of plan step errors and plan structure violations
Failures in plan execution can be attributed to errors in the execution of plan steps or violations of the plan structure. The structure of a plan prescribes which actions have to...
Cees Witteveen, Nico Roos, Adriaan ter Mors, Xiaoy...