Sciweavers

234 search results - page 31 / 47
» Implementation of Fault-Tolerant GridRPC Applications
PVM
2009
Springer
15 years 4 months ago
VolpexMPI: An MPI Library for Execution of Parallel Applications on Volatile Nodes
The objective of this research is to convert ordinary idle PCs into virtual clusters for executing parallel applications. The paper introduces VolpexMPI that is designed to enable ...
Troy LeBlanc, Rakhi Anand, Edgar Gabriel, Jaspal S...
ICSE
2003
IEEE-ACM
15 years 3 months ago
Supporting Dependable Distributed Applications Through a Component-Oriented Middleware-Based Group Service
Abstract. Dependable distributed applications require flexible infrastructure support for controlled redundancy, replication, and recovery of components and services. However, mos...
Katia B. Saikoski, Geoff Coulson
DSN
2004
IEEE
15 years 1 months ago
Cluster-Based Failure Detection Service for Large-Scale Ad Hoc Wireless Network Applications
The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault to...
Ann T. Tai, Kam S. Tso, William H. Sanders
ISSRE
2007
IEEE
14 years 11 months ago
Towards Self-Protecting Enterprise Applications
Enterprise systems must guarantee high availability and reliability to provide 24/7 services without interruptions and failures. Mechanisms for handling exceptional cases and impl...
Davide Lorenzoli, Leonardo Mariani, Mauro Pezzè...
HPDC
2009
IEEE
15 years 4 months ago
Interconnect agnostic checkpoint/restart in Open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine