Sciweavers

SC
2004
ACM

Optimal File-Bundle Caching Algorithms for Data-Grids

The file-bundle caching problem arises frequently in scientific applications in which jobs process several files concurrently. Consider a host system in a data-grid that maintains a disk cache for servicing jobs, where each job requests a set of files (a file-bundle) and is serviced only if all of its requested files are present in the disk cache. Files must therefore be admitted into the cache and replaced as file-bundles rather than individually. We show that traditional caching algorithms based on file popularity measures do not perform well, since they may retain irrelevant combinations of files in the cache. We present and analyze a new caching algorithm for maximizing the throughput of jobs and minimizing data replacement costs at such data-grid hosts. We tested the new algorithm using a disk cache simulation model under a wide range of conditions, including different file request distributions, cache sizes, and file size distributions. The results show significant improvement over traditional caching algorithms.
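To make the servicing rule concrete, the following is a minimal Python sketch of the problem setting only; it is not the algorithm presented in the paper. It assumes unit-size files and a tiny made-up workload, and the names `serviced_jobs`, `popularity_cache`, `best_bundle_cache`, and `JOBS` are hypothetical. It compares a cache filled by per-file popularity with one chosen by brute-force search over file combinations.

```python
from collections import Counter
from itertools import combinations

def serviced_jobs(jobs, cache):
    """Count jobs whose entire file-bundle is present in the cache."""
    return sum(1 for bundle in jobs if bundle <= cache)

def popularity_cache(jobs, capacity):
    """Fill the cache with the individually most requested files.

    Assumption: every file has unit size, so capacity is a file count.
    Ties are broken alphabetically to keep the result deterministic.
    """
    counts = Counter(f for bundle in jobs for f in bundle)
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return frozenset(ranked[:capacity])

def best_bundle_cache(jobs, capacity):
    """Brute-force baseline: try every combination of `capacity` files and
    keep the one that services the most jobs. Exponential in the number of
    files, so only usable for toy inputs; shown purely for comparison.
    """
    files = sorted({f for bundle in jobs for f in bundle})
    size = min(capacity, len(files))
    best = max(combinations(files, size),
               key=lambda combo: serviced_jobs(jobs, frozenset(combo)))
    return frozenset(best)

# Hypothetical workload: five jobs, each requesting a bundle of two files.
JOBS = [frozenset("ab"), frozenset("ab"),
        frozenset("cd"), frozenset("ce"), frozenset("cf")]

if __name__ == "__main__":
    popular = popularity_cache(JOBS, capacity=2)
    bundled = best_bundle_cache(JOBS, capacity=2)
    print(sorted(popular), serviced_jobs(JOBS, popular))
    print(sorted(bundled), serviced_jobs(JOBS, bundled))
```

In this toy workload, file `c` is the most requested file, so the popularity-based cache holds `a` and `c` and services no job at all, while caching the bundle `a`, `b` services two jobs. This illustrates, under the stated assumptions, how ranking files individually can leave the cache holding a combination that completes no bundle.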
Ekow J. Otoo, Doron Rotem, Alexandru Romosan
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where SC