
WWW
2008
ACM

Divide, Compress and Conquer: Querying XML via Partitioned Path-Based Compressed Data Blocks

We propose a novel Partitioned Path-Based (PPB) grouping strategy to store compressed XML data in a stream of blocks. In addition, we employ a minimal indexing scheme called Block Statistic Signature (BSS) on the compressed data, a simple but effective technique that supports the evaluation of selection and aggregate XPath queries over the compressed data. We present a formal analysis and an empirical study of these techniques. BSS indexing is first extended into the Cluster Statistic Signature (CSS) and Multiple-Cluster Statistic Signature (MSS) indexing schemes by adding further layers of indexes. We analyze how the response time is affected by the parameters of our compression strategy, such as the data stream block size, the number of cluster layers, and the query selectivity. We also gain further insight into compression and querying performance by studying the optimal block size in a stream, which leads to the minimum processing cost for queries. The cost model ...
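The idea of answering queries from per-block statistics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say what a Block Statistic Signature contains, so the fields used here (minimum, maximum, sum, count) are an assumption, as are the names `Block`, `build_blocks`, `select_range`, and `sum_all`, the use of zlib as the compressor, and the sample values. The sketch covers only the single-layer case, not the CSS or MSS layers.

```python
import json
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    # Assumed signature fields; the abstract does not list them.
    lo: float       # smallest value in the block
    hi: float       # largest value in the block
    total: float    # sum of the values in the block
    count: int      # number of values in the block
    payload: bytes  # zlib-compressed JSON list of the values


def build_blocks(values, block_size):
    """Split the values of one path into fixed-size blocks and compress each."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        payload = zlib.compress(json.dumps(chunk).encode("utf-8"))
        blocks.append(Block(min(chunk), max(chunk), sum(chunk), len(chunk), payload))
    return blocks


def select_range(blocks, low, high):
    """Return the values in [low, high] and how many blocks were decompressed."""
    hits, opened = [], 0
    for b in blocks:
        if b.hi < low or b.lo > high:
            continue  # the statistics rule this block out; no decompression
        opened += 1
        chunk = json.loads(zlib.decompress(b.payload).decode("utf-8"))
        hits.extend(v for v in chunk if low <= v <= high)
    return hits, opened


def sum_all(blocks):
    """Answer an aggregate over all values from the statistics alone."""
    return sum(b.total for b in blocks)


# Hypothetical values for a single path, e.g. every price element.
prices = [5, 7, 9, 12, 40, 42, 45, 48, 90, 95, 97, 99]
blocks = build_blocks(prices, block_size=4)
hits, opened = select_range(blocks, 41, 50)
print(hits, opened)     # [42, 45, 48] 1
print(sum_all(blocks))  # 589
```

With three blocks of four values, the range query decompresses only the one block whose minimum and maximum overlap the range, and the sum is computed without decompressing anything. The block size governs the trade-off the abstract refers to: smaller blocks allow more skipping but add more signatures to scan.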
Wilfred Ng, Ho Lam Lau, Aoying Zhou
Added 16 Dec 2010
Updated 16 Dec 2010
Type Journal
Year 2008
Where WWW
Authors Wilfred Ng, Ho Lam Lau, Aoying Zhou