Sciweavers

CCGRID
2009
IEEE
13 years 11 months ago
Performance under Failures of DAG-based Parallel Computing
As the scale and complexity of parallel systems continue to grow, failures become more and more an inevitable fact for solving large-scale applications. In this research, we pr...
Hui Jin, Xian-He Sun, Ziming Zheng, Zhiling Lan, B...
CLUSTER
1999
IEEE
13 years 4 months ago
Simulative performance analysis of gossip failure detection for scalable distributed systems
Three protocols for gossip-based failure detection services in large-scale heterogeneous clusters are analyzed and compared. The basic gossip protocol provides a means by which fai...
Mark W. Burns, Alan D. George, Bradley A. Wallace
CF
2009
ACM
13 years 2 months ago
High accuracy failure injection in parallel and distributed systems using virtualization
Emulation sits between simulation and experimentation to complete the set of tools available for software designers to evaluate their software and predict behavior under condition...
Thomas Hérault, Thomas Largillier, Sylvain ...
HPDC
2012
IEEE
11 years 7 months ago
Understanding the effects and implications of compute node related failures in hadoop
Hadoop has become a critical component in today’s cloud environment. Ensuring good performance for Hadoop is paramount for the wide range of applications built on top of it. In ...
Florin Dinu, T. S. Eugene Ng
SPAA
2009
ACM
14 years 5 months ago
The weakest failure detector for wait-free dining under eventual weak exclusion
Dining philosophers is a classic scheduling problem for local mutual exclusion on arbitrary conflict graphs. We establish necessary conditions to solve wait-free dining under even...
Srikanth Sastry, Scott M. Pike, Jennifer L. Welch