To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
The paper describes a metaobject architecture for distributed fault tolerant systems. Basically metaobject protocols enables functional objects to be independent from meta-function...
This paper deals with multiprocessor systems required to provide both high performance and good figures of dependability attributes. Fault tolerance is pursued through a proper co...
Felicita Di Giandomenico, Silvano Chiaradonna, And...
— In this paper, we study the survivability problem in hierarchical wireless access networks with dual-homed end users, who are connected to two base stations (BSs), a primary BS...
In this paper, a task parallel application is implemented with Ninf-G which is a GridRPC system, and experimented on, using the Grid testbed in Asia Pacific, for three months. The...