Sciweavers

» Performance under Failures of DAG-based Parallel Computing
IPPS
2009
IEEE
15 years 9 months ago
Design, implementation, and evaluation of transparent pNFS on Lustre
Parallel NFS (pNFS) is an emergent open standard for parallelizing data transfer over a variety of I/O protocols. Prototypes of pNFS are actively being developed by industry and a...
Weikuan Yu, Oleg Drokin, Jeffrey S. Vetter
EUROPAR
2003
Springer
15 years 7 months ago
FOBS: A Lightweight Communication Protocol for Grid Computing
The advent of high-performance networks in conjunction with low-cost, powerful computational engines has made possible the development of a new set of technologies termed computat...
Phillip M. Dickens
ICPP
2009
IEEE
15 years 9 months ago
Accelerating Checkpoint Operation by Node-Level Write Aggregation on Multicore Systems
Clusters and applications continue to grow in size while their mean time between failure (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...
Xiangyong Ouyang, Karthik Gopalakrishnan, Dhabales...
SC
2005
ACM
15 years 8 months ago
Performance-constrained Distributed DVS Scheduling for Scientific Applications on Power-aware Clusters
Left unchecked, the fundamental drive to increase peak performance using tens of thousands of power hungry components will lead to intolerable operating costs and failure rates. H...
Rong Ge, Xizhou Feng, Kirk W. Cameron
CCGRID
2010
IEEE
14 years 9 months ago
Towards Autonomic Service Provisioning Systems
This paper discusses our experience in building SPIRE, an autonomic system for service provision. The architecture consists of a set of hosted Web Services subject to QoS constrain...
Michele Mazzucco