Sciweavers

ISCC
2005
IEEE
13 years 10 months ago
Optimizing the Reliable Distribution of Large Files within CDNs
Content Delivery Networks (CDNs) provide efficient support for serving HTTP and streaming media content while minimizing the network impact of content delivery as well...
Ludmila Cherkasova
IPPS
2005
IEEE
13 years 10 months ago
Fault-Tolerant Parallel Applications with Dynamic Parallel Schedules
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
Sebastian Gerlach, Roger D. Hersch
INFOCOM
2005
IEEE
13 years 10 months ago
The one-to-many TCP overlay: a scalable and reliable multicast architecture
We consider reliable multicast in overlay networks where nodes have finite-size buffers and are subject to failures. We address issues of end-to-end reliability and th...
François Baccelli, Augustin Chaintreau, Zhe...
DSN
2005
IEEE
13 years 10 months ago
A Framework for Node-Level Fault Tolerance in Distributed Real-Time Systems
This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in orde...
Joakim Aidemark, Peter Folkesson, Johan Karlsson
IPPS
2006
IEEE
13 years 10 months ago
Algorithm-based checkpoint-free fault tolerance for parallel matrix computations on volatile resources
As the desire of scientists to perform ever larger computations drives the size of today’s high performance computers from hundreds, to thousands, and even tens of thousands of ...
Zizhong Chen, Jack Dongarra
IPPS
2006
IEEE
13 years 10 months ago
On consistency maintenance in service discovery
Communication and node failures degrade the ability of a service discovery protocol to ensure Users receive the correct service information when the service changes. We propose th...
Vasughi Sundramoorthy, Pieter H. Hartel, Hans Scho...
PRDC
2007
IEEE
13 years 11 months ago
Implementation of a Flexible Membership Protocol on a Real-Time Ethernet Prototype
This paper describes the implementation of a processor-group membership protocol in an experimental real-time network. The protocol is appropriate for fault-tolerant distributed sy...
Raul Barbosa, António Ferreira, Johan Karls...
ICDCS
2007
IEEE
13 years 11 months ago
Protocol Design for Dynamic Delaunay Triangulation
Delaunay triangulation (DT) is a useful geometric structure for networking applications. In this paper we investigate the design of join, leave, and maintenance protocols to const...
Dong-Young Lee, Simon S. Lam
HASE
2007
IEEE
13 years 11 months ago
Scalable, Adaptive, Time-Bounded Node Failure Detection
This paper presents a scalable, adaptive and time-bounded general approach to assure reliable, real-time Node-Failure Detection (NFD) for large-scale, high load networks comprised ...
Matthew Gillen, Kurt Rohloff, Prakash Manghwani, R...
DSN
2007
IEEE
13 years 11 months ago
R-Sentry: Providing Continuous Sensor Services against Random Node Failures
The success of sensor-driven applications is reliant on whether a steady stream of data can be provided by the underlying system. This need, however, poses great challenges to sen...
Shengchao Yu, Yanyong Zhang