Sciweavers

DOLAP
1999
ACM

A Cache Filtering Optimisation for Queries to Massive Datasets on Tertiary Storage

We consider a system in which many users run queries to examine subsets of a large object set. The object set is partitioned into files on tape. A single subset of objects will be visited by multiple queries in the workload. This locality of access creates the opportunity for caching on disk. We introduce and evaluate a novel optimisation, cache filtering, in which the 'hot' objects are automatically extracted from the files that are staged on disk and then cached separately in new files on disk. Cache filtering can lead to complex situations in the disk cache. We show that these do not prevent effective caching, and we introduce a special cache replacement algorithm to maximise efficiency. Through simulations we evaluate the system over a broad range of likely workloads. Depending on workload and system parameters, the cache filtering optimisation yields speedup factors of up to 6.
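The cache filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of an access-count threshold to decide which objects are 'hot', and the unbounded cache are all assumptions made for illustration. The special cache replacement algorithm mentioned in the abstract is not modelled here.

```python
from collections import Counter


def filter_hot_objects(staged_file, access_counts, hot_threshold):
    """Return the objects of a staged file that count as 'hot'.

    Assumption: an object is 'hot' once it has been requested at least
    hot_threshold times. The abstract does not specify this criterion.
    """
    return {obj for obj in staged_file if access_counts[obj] >= hot_threshold}


def run_query(query_objects, tape_files, filtered_cache, access_counts,
              hot_threshold=2):
    """Serve one query and return the number of files staged from tape.

    tape_files maps a (hypothetical) file name to the set of object IDs it
    holds. filtered_cache is the set of object IDs already extracted into
    separate files on disk. It is updated in place and has no capacity
    limit in this sketch.
    """
    for obj in query_objects:
        access_counts[obj] += 1

    # Objects not yet in the filtered cache must come from tape.
    missing = set(query_objects) - filtered_cache
    staged = 0
    for objects in tape_files.values():
        if missing & objects:
            staged += 1
            # Cache filtering step: keep only the hot objects of the
            # staged file, rather than caching the whole file.
            filtered_cache |= filter_hot_objects(objects, access_counts,
                                                 hot_threshold)
            missing -= objects
    return staged


tape_files = {"f1": {1, 2, 3, 4}, "f2": {5, 6, 7, 8}}
cache, counts = set(), Counter()
query = [1, 5]
staged_per_run = [run_query(query, tape_files, cache, counts) for _ in range(3)]
```

In this toy run the same query is issued three times. The first two runs each stage both files, and by the third run the two requested objects have been extracted into the filtered cache, so nothing is staged from tape. Only the requested objects occupy cache space, not the full contents of the two files.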
Koen Holtman, Peter van der Stok, Ian Willers
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 1999
Where DOLAP
Authors Koen Holtman, Peter van der Stok, Ian Willers