
BNCOD
2003

External Sorting with On-the-Fly Compression

Evaluating a query can involve manipulation of large volumes of temporary data. When the volume of data becomes too great, activities such as joins and sorting must use disk, and cost minimisation involves complex trade-offs. In this paper, we explore the effect of compression on the cost of external sorting. Reduction in the volume of data potentially allows costs to be reduced (through reductions in disk traffic and numbers of temporary files), but on-the-fly compression can be slow and many compression methods do not allow random access to individual records. We investigate a range of compression techniques for this problem, and develop successful methods based on common letter sequences. Our experiments show that, for a given memory limit, the overheads of compression outweigh the benefits for smaller data volumes, but for large files compression can yield substantial gains, of one-third of costs in the best case tested. Even when the data is stored uncompressed, our results...
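As a rough illustration of the trade-off the abstract describes, the following is a minimal sketch of an external merge sort whose temporary runs can optionally be compressed on the fly. It is not the authors' method: gzip stands in for the compression techniques based on common letter sequences that the paper develops, and the function names, the `max_in_memory` record-count limit, and the newline-delimited record format are all assumptions made for this example.

```python
import gzip
import heapq
import os
import tempfile


def _write_run(records, compress):
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run")
    os.close(fd)
    # Assumption: gzip is a stand-in for the paper's compression schemes.
    opener = gzip.open if compress else open
    with opener(path, "wt", encoding="utf-8") as f:
        for rec in records:
            f.write(rec + "\n")
    return path


def _read_run(path, compress):
    """Stream records back from a run file, one at a time."""
    opener = gzip.open if compress else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def external_sort(records, max_in_memory, compress=True):
    """Yield newline-free string records in sorted order.

    At most max_in_memory records are buffered before a sorted run
    is spilled to disk (hypothetical memory limit, counted in records).
    """
    run_paths = []
    buffer = []
    for rec in records:
        buffer.append(rec)
        if len(buffer) >= max_in_memory:
            run_paths.append(_write_run(sorted(buffer), compress))
            buffer = []
    if buffer:
        run_paths.append(_write_run(sorted(buffer), compress))
    try:
        # Lazy k-way merge: each run is read sequentially, so no random
        # access into the compressed data is needed.
        runs = [_read_run(p, compress) for p in run_paths]
        yield from heapq.merge(*runs)
    finally:
        for p in run_paths:
            os.remove(p)
```

Calling `list(external_sort(lines, max_in_memory=100_000))` sorts an iterable of strings while spilling compressed runs to disk. Timing the same call with `compress=False` gives a crude way to see when the CPU overhead of compression is repaid by reduced disk traffic, although the break-even point will depend on the data, the hardware, and the compressor.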
John Yiannis, Justin Zobel
Added: 31 Oct 2010
Updated: 31 Oct 2010
Type: Conference
Year: 2003
Where: BNCOD
Authors: John Yiannis, Justin Zobel