Search Sciweavers | Sciweavers

» Fault Tolerance in a DSM Cluster Operating System

CCGRID
2003
IEEE

Improved Read Performance in a Cost-Effective, Fault-Tolerant Parallel Virtual File System (CEFT-PVFS)

13 years 10 months ago

Due to the ever-widening performance gap between processors and disks, I/O operations tend to become the major performance bottleneck of data-intensive applications on modern clus...

Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, David ...

CCGRID
2006
IEEE

Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation

13 years 11 months ago

With the increasing number of processors in modern HPC (High Performance Computing) systems, two problems have emerged. One is scalability, the other is fault tolera...

Yuan Tang, Graham E. Fagg, Jack Dongarra

CLUSTER
2003
IEEE

Coordinated Checkpoint versus Message Log for Fault Tolerant MPI

13 years 10 months ago

Large clusters, high-availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...

Aurelien Bouteiller, Pierre Lemarinier, Gér...

SOSP
2009
ACM

UpRight Cluster Services

14 years 2 months ago

The UpRight library seeks to make Byzantine fault tolerance (BFT) a simple and viable alternative to crash fault tolerance for a range of cluster services. We demonstrate UpRight ...

Allen Clement, Manos Kapritsos, Sangmin Lee, Yang ...

CLUSTER
2002
IEEE

Design and Validation of Portable Communication Infrastructure for Fault-Tolerant Cluster Middleware

13 years 10 months ago

We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...

Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...
