BNCOD
2007

Indexing and Searching XML Documents Based on Content and Structure Synopses

We present a novel framework for indexing and searching schema-less XML documents based on concise summaries of their structural and textual content. Our search query language is XPath extended with full-text search. We introduce two novel data synopsis structures that correlate textual with positional information in an XML document and improve query precision. In addition, we present a two-phase containment filtering algorithm based on these synopses that improves the searching process. Our experimental evaluation shows that our data synopses indexing scheme outperforms the standard XML indexing scheme based on inverted lists, that query evaluation based on our data synopses is more accurate than related approximate approaches that do not consider positional information, and that our two-phase containment filtering algorithm is more efficient than a single-phase brute-force algorithm.
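The abstract does not spell out the synopsis structures or the filtering steps, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy synopsis that coarsens token positions into a fixed number of buckets and stores one bitmask per term and per label path. The names `Synopsis`, `bucket_mask`, `may_match`, and `NUM_BUCKETS` are hypothetical. The filter first checks that the path and every term occur at all, then checks that each term shares a position bucket with the path. A `True` result means only that the document may match.

```python
from collections import defaultdict

NUM_BUCKETS = 16  # hypothetical synopsis width (assumption, not from the paper)


def bucket_mask(start, end, doc_len):
    """Bitmask of the position buckets covered by the token range [start, end)."""
    if end <= start:
        return 0
    first = start * NUM_BUCKETS // doc_len
    last = (end - 1) * NUM_BUCKETS // doc_len
    mask = 0
    for b in range(first, last + 1):
        mask |= 1 << b
    return mask


class Synopsis:
    """Toy per-document summary: one bucket bitmask per term and per label path."""

    def __init__(self, doc_len):
        self.doc_len = doc_len
        self.terms = defaultdict(int)
        self.paths = defaultdict(int)

    def add_element(self, path, start, tokens):
        # Record where the element lies and where each of its terms occurs.
        self.paths[path] |= bucket_mask(start, start + len(tokens), self.doc_len)
        for offset, term in enumerate(tokens):
            pos = start + offset
            self.terms[term.lower()] |= bucket_mask(pos, pos + 1, self.doc_len)


def may_match(syn, path, query_terms):
    """Two-phase filter: False is definite, True means 'possibly matches'."""
    terms = [t.lower() for t in query_terms]
    # Phase 1: cheap existence check on the path and on every query term.
    if path not in syn.paths or any(t not in syn.terms for t in terms):
        return False
    # Phase 2: positional check, each term must share a bucket with the path.
    region = syn.paths[path]
    return all(syn.terms[t] & region for t in terms)


syn = Synopsis(doc_len=32)
syn.add_element("/book/title", 0, ["XML", "indexing"])
syn.add_element("/book/author", 16, ["Smith"])
print(may_match(syn, "/book/title", ["xml"]))
print(may_match(syn, "/book/title", ["smith"]))
```

In this sketch the second query passes the existence check, because "smith" occurs somewhere in the document, but fails the positional check, because it never falls in a bucket covered by `/book/title`. That is the kind of false positive a summary without positional information could not rule out.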
Weimin He, Leonidas Fegaras, David Levine
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where BNCOD
Authors Weimin He, Leonidas Fegaras, David Levine