Sciweavers

535 search results - page 13 / 107
» Fault tolerant high performance computing by a coding approa...
ECOWS
2010
Springer
14 years 10 months ago
Shepherd: node monitors for fault-tolerant distributed process execution in OSIRIS
OSIRIS is a middleware for the composition and orchestration of distributed web services that follows a P2P decentralized approach to process execution, providing already some deg...
Diego Milano, Nenad Stojnic
TC
2010
14 years 10 months ago
PERFECTORY: A Fault-Tolerant Directory Memory Architecture
The number of CPUs in chip multiprocessors is growing at the Moore’s Law rate, due to continued technology advances. However, new technologies pose serious reliability challen...
Hyunjin Lee, Sangyeun Cho, Bruce R. Childers
TDSC
2011
14 years 7 months ago
RITAS: Services for Randomized Intrusion Tolerance
Randomized agreement protocols have been around for more than two decades. Often assumed to be inefficient due to their high expected communication and computation complexitie...
Henrique Moniz, Nuno Ferreira Neves, Miguel Correi...
CAL
2006
15 years 15 days ago
A case for fault tolerance and performance enhancement using chip multi-processors
This paper makes a case for using multi-core processors to simultaneously achieve transient-fault tolerance and performance enhancement. Our approach is extended from a recent late...
Huiyang Zhou
TC
2008
15 years 10 days ago
Adaptive Fault Management of Parallel Applications for High-Performance Computing
As the scale of high-performance computing (HPC) continues to grow, failure resilience of parallel applications becomes crucial. In this paper, we present FT-Pro, an adaptive fault...
Zhiling Lan, Yawei Li