Search Sciweavers | Sciweavers

37 search results - page 5 / 8

» High performance linpack benchmark: a fault tolerant impleme...


CCGRID
2006
IEEE

125 views, Distributed And Parallel Com...

Exploit Failure Prediction for Adaptive Fault-Tolerance in Cluster Computing

13 years 11 months ago

Download www.cs.iit.edu

As the scale of cluster computing grows, it is becoming hard for long-running applications to complete without facing failures on large-scale clusters. To address this issue, chec...

Yawei Li, Zhiling Lan



ASPLOS
2006
ACM

152 views, Programming Languages

Understanding prediction-based partial redundant threading for low-overhead, high-coverage fault tolerance

13 years 11 months ago

Download www4.ncsu.edu

Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) archite...

Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasa...



NSDI
2010

180 views, Computer Networks

Prophecy: Using History for High-Throughput Fault Tolerance

13 years 6 months ago

Download www.cs.princeton.edu

Byzantine fault-tolerant (BFT) replication has enjoyed a series of performance improvements, but remains costly due to its replicated work. We eliminate this cost for read-mostly ...

Siddhartha Sen, Wyatt Lloyd, Michael J. Freedman



HPDC
2009
IEEE

101 views, Distributed And Parallel Com...

Interconnect agnostic checkpoint/restart in Open MPI

14 years 1 days ago

Download www.osl.iu.edu

Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...

Joshua Hursey, Timothy Mattox, Andrew Lumsdaine



SOSP
2007
ACM

240 views, Operating System

Tolerating Byzantine faults in transaction processing systems using commit barrier scheduling

14 years 2 months ago

Download people.csail.mit.edu

This paper describes the design, implementation, and evaluation of a replication scheme to handle Byzantine faults in transaction processing database systems. The scheme compares ...

Ben Vandiver, Hari Balakrishnan, Barbara Liskov, S...

