Sciweavers

PDPTA
2003
14 years 11 months ago
Data Integrity in a Distributed Storage System
Distributed storage systems must provide highly available access to data while maintaining high performance and maximum scalability. In addition, reliability in a storage system is...
Jonathan D. Bright, John A. Chandy
ICPADS
2002
IEEE
15 years 2 months ago
Sago: A Network Resource Management System for Real-Time Content Distribution
Content replication and distribution is an effective technology to reduce the response time for web accesses and has been proven quite popular among large Internet cont...
Tzi-cker Chiueh, Kartik Gopalan, Anindya Neogi, Ch...
CCGRID
2006
IEEE
15 years 3 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emerging problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
ECOWS
2010
Springer
14 years 7 months ago
Shepherd: node monitors for fault-tolerant distributed process execution in OSIRIS
OSIRIS is a middleware for the composition and orchestration of distributed web services that follows a P2P decentralized approach to process execution, providing already some deg...
Diego Milano, Nenad Stojnic
DAC
2011
ACM
13 years 9 months ago
DRAIN: distributed recovery architecture for inaccessible nodes in multi-core chips
As transistor dimensions continue to scale deep into the nanometer regime, silicon reliability is becoming a chief concern. At the same time, transistor counts are scaling up, ena...
Andrew DeOrio, Konstantinos Aisopos, Valeria Berta...