Sciweavers

253 search results - page 16 / 51
» Increasing the availability provided by RADIC with low overh...
HIPC
2007
Springer
15 years 4 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
CORR
2010
Springer
14 years 5 months ago
Precise, Scalable and Online Request Tracing for Multi-tier Services of Black Boxes
As more and more multi-tier services are developed from commercial off-the-shelf components or heterogeneous middleware without source code available, both developers and administr...
Bo Sang, Jianfeng Zhan, Zhihong Zhang, Lei Wang, D...
BROADNETS
2006
IEEE
15 years 4 months ago
Transparent Optimization of Grid Server Selection With Real-Time Passive Network Measurements
Grid services have tremendously simplified the programming challenges in leveraging large-scale distributed com... At the same time, the increased level of abstraction reduces the op...
Marcia Zangrilli, Bruce Lowekamp
FPL
2005
Springer
15 years 3 months ago
Defect-Tolerant FPGA Switch Block and Connection Block with Fine-Grain Redundancy for Yield Enhancement
Future process nodes have such small feature sizes that there will be an increase in the number of manufacturing defects per die. For large FPGAs, it will be critical to tolerate ...
Anthony J. Yu, Guy G. Lemieux
PPOPP
2005
ACM
15 years 4 months ago
Trust but verify: monitoring remotely executing programs for progress and correctness
The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of tho...
Shuo Yang, Ali Raza Butt, Y. Charlie Hu, Samuel P....