Sciweavers

» Detecting a Network Failure
TPDS
2008
Algorithm-Based Fault Tolerance for Fail-Stop Failures
Fail-stop failures in distributed environments are often tolerated by checkpointing or message logging. In this paper, we show that fail-stop process failures in ScaLAPACK matrix ...
Zizhong Chen, Jack Dongarra
JSS
2007
Understanding failure response in service discovery systems
Service discovery systems enable distributed components to find each other without prior arrangement, to express capabilities and needs, to aggregate into useful compositions, an...
Christopher Dabrowski, Kevin Mills, Stephen Quirol...
SRDS
2006
IEEE
Recovering from Distributable Thread Failures with Assured Timeliness in Real-Time Distributed Systems
We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it causes orph...
Edward Curley, Jonathan Stephen Anderson, Binoy Ra...
HICSS
2005
IEEE
Low-Bandwidth Topology Maintenance for Robustness in Structured Overlay Networks
Structured peer-to-peer systems have emerged as infrastructures for resource sharing in large-scale, distributed, and dynamic environments. One challenge in these systems is to...
Ali Ghodsi, Luc Onana Alima, Seif Haridi
CACM
1999
Putting OO Distributed Programming to Work
...abstractions underlying distributed computing. We attempted to keep our claims at an abstract and general level. In this column, we make those claims more concrete. More precisely, ...
Pascal Felber, Rachid Guerraoui, Mohamed Fayad