Sciweavers

» Systems Failures
INFOCOM
2007
IEEE
15 years 10 months ago
Network Coding for Distributed Storage Systems
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-pee...
Alexandros G. Dimakis, Brighten Godfrey, Martin J....
DC
2008
15 years 3 months ago
On implementing omega in systems with weak reliability and synchrony assumptions
We study the feasibility and cost of implementing Ω, a fundamental failure detector at the core of many algorithms, in systems with weak reliability and synchrony assumptions. Intui...
Marcos Kawazoe Aguilera, Carole Delporte-Gallet, H...
ICDCS
2008
IEEE
15 years 10 months ago
Can We Really Recover Data if Storage Subsystem Fails?
This paper presents a theoretical and experimental study on the limitations of copy-on-write snapshots and incremental backups in terms of data recoverability. We provide mathemat...
Weijun Xiao, Qing Yang
SIGSOFT
2010
ACM
15 years 1 months ago
Finding latent performance bugs in systems implementations
Robust distributed systems commonly employ high-level recovery mechanisms enabling the system to recover from a wide variety of problematic environmental conditions such as node f...
Charles Edwin Killian, Karthik Nagaraj, Salman Per...
USENIX
1990
15 years 5 months ago
Implementation of the Ficus Replicated File System
As we approach nation-wide integration of computer systems, it is clear that file replication will play a key role, both to improve data availability in the face of failures, and to...
Richard G. Guy, John S. Heidemann, Wai-Kei Mak, Th...