Efficient and Scalable Sequence-Based XML Filtering

The ubiquitous adoption of XML as the standard of data exchange over the web has led to increased interest in building efficient and scalable XML publish-subscribe (pub-sub) systems. The central function of an XML-based pub-sub system is to perform XML filtering efficiently, i.e. identify those XPath expressions that have a match in a streaming XML document. In this paper, we propose a new sequence-based approach, which transforms both XML documents and XPath twig expressions into Node Encoded Tree Sequences (NETS). In terms of this encoding, we provide a necessary and sufficient condition for an XPath twig to represent a match in a given XML document. The proposed filtering procedure is based on a new subsequence matching algorithm devised for NETS, which identifies the set of matched queries free of false positives with a single scan of the XML document. Extensive experimental results show that the NETS method outperforms previous XML filtering approaches.
Mariam Salloum, Vassilis J. Tsotras
Added 25 May 2010
Updated 25 May 2010
Type Conference
Year 2009
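
To make the filtering task in the abstract concrete, here is a minimal sketch of deciding which registered queries match a document in a single scan. It is not the NETS encoding or the paper's subsequence matching algorithm, neither of which is specified in this listing. It is a naive baseline that assumes queries are restricted to absolute child-axis paths such as /a/b/c, with no twigs, wildcards, or predicates. The function name filter_queries and the sample document and queries are hypothetical, chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET


def filter_queries(xml_text, queries):
    """Return the names of the queries that match somewhere in the document.

    Assumption: each query is an absolute path using only the child axis,
    e.g. "/catalog/book/title". This is a simplification of XPath twigs.
    """
    # Pre-split every path into a tuple of element names.
    wanted = {name: tuple(path.strip("/").split("/"))
              for name, path in queries.items()}
    matched = set()
    stack = []  # names of the currently open elements, root first

    source = io.BytesIO(xml_text.encode("utf-8"))
    # iterparse reports elements as they open and close, so the document
    # is scanned once and never fully materialized by this function.
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            current = tuple(stack)
            for name, steps in wanted.items():
                if name not in matched and steps == current:
                    matched.add(name)
        else:
            stack.pop()
            elem.clear()  # release the element's children once it is closed
    return matched


if __name__ == "__main__":
    doc = "<catalog><book><title>X</title><author>Y</author></book></catalog>"
    subscriptions = {
        "q1": "/catalog/book/title",
        "q2": "/catalog/magazine",
        "q3": "/catalog/book",
    }
    print(sorted(filter_queries(doc, subscriptions)))
```

This baseline compares every unmatched query against every opened element, so its cost grows with the number of subscriptions. Sharing work across many queries is the kind of scalability concern the abstract raises.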
