Sciweavers

» Performance under Failures of DAG-based Parallel Computing
GRID
2006
Springer
15 years 1 month ago
Operating System Support for Space Allocation in Grid Storage Systems
Abstract-- Shared temporary storage space is often the constraining resource for clusters that serve as execution nodes in wide-area distributed systems. At least one large nationa...
Douglas Thain
JPDC
2006
15 years 1 month ago
CEFT: A cost-effective, fault-tolerant parallel virtual file system
The vulnerability of computer nodes due to component failures is a critical issue for cluster-based file systems. This paper studies the development and deployment of mirroring in...
Yifeng Zhu, Hong Jiang
PVM
2010
Springer
15 years 5 days ago
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols
Abstract. With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas Hé...
HIPC
2000
Springer
15 years 5 months ago
Meta-data Management System for High-Performance Large-Scale Scientific Data Access
Many scientific applications manipulate large amounts of data and, therefore, are parallelized on high-performance computing systems to take advantage of their computational power a...
Wei-keng Liao, Xiaohui Shen, Alok N. Choudhary
HPCA
2011
IEEE
14 years 5 months ago
Beyond block I/O: Rethinking traditional storage primitives
Over the last twenty years the interfaces for accessing persistent storage within a computer system have remained essentially unchanged. Simply put, seek, read and write have defi...
Xiangyong Ouyang, David W. Nellans, Robert Wipfel,...