Search Sciweavers | Sciweavers

IPPS
2007
IEEE

129 views, Distributed And Parallel Com...

A Fault Tolerance Protocol with Fast Fault Recovery

13 years 11 months ago

Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...

Sayantan Chakravorty, Laxmikant V. Kalé

ATAL
2009
Springer

226 views, Intelligent Agents

Combining fault injection and model checking to verify fault tolerance in multi-agent systems

13 years 12 months ago

Download www.aamas-conference.org

The ability to guarantee that a system will continue to operate correctly under degraded conditions is key to the success of adopting multi-agent systems (MAS) as a paradigm for d...

Jonathan Ezekiel, Alessio Lomuscio

CLUSTER
2006
IEEE

121 views, Distributed And Parallel Com...

Autonomous recovery in componentized Internet applications

13 years 5 months ago

Download infoscience.epfl.ch

In this paper we show how to reduce downtime of J2EE applications by rapidly and automatically recovering from transient and intermittent software failures, without requiring appl...

George Candea, Emre Kiciman, Shinichi Kawamoto, Ar...

PVM
2010
Springer

123 views, Distributed And Parallel Com...

Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols

13 years 3 months ago

Download icl.cs.utk.edu

With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...

George Bosilca, Aurelien Bouteiller, Thomas Hé...

ENTCS
2002

115 views

Component-Based Applications: A Dynamic Reconfiguration Approach with Fault Tolerance Support

13 years 5 months ago

Download www.dimap.ufrn.br

This paper presents a mechanism for dynamic reconfiguration of component-based applications and its fault tolerance strategy. The mechanism, named generic connector, allows compos...

Thaís Vasconcelos Batista, Milano Gadelha C...
