
ICDE
2003
IEEE

PIX: A System for Phrase Matching in XML Documents

We present a system that enables flexible and efficient phrase matching in XML documents. Because XML allows structured and unstructured information to be interleaved, phrase matching in XML raises new challenges. Our system, named PIX, supports phrase matching in XML documents that contain "mixed content". A key feature of PIX is that users can specify which elements and content to ignore when matching a phrase. PIX uses inverted indices and an efficient evaluation algorithm to compute the set of matches, and returns answers in which the matched phrases, ignored tags, and ignored content are highlighted. In addition, query answers are sorted using a ranking function. PIX is implemented as an extension of GALAX, a full-fledged XQuery engine. The functionality of PIX is fully integrated into XQuery and permits a natural combination of XPath-based structure matching with phrase matching.
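To make the idea of ignoring tags and content concrete, here is a minimal Python sketch. It is not PIX's algorithm: it uses a naive linear scan rather than inverted indices, does no ranking or highlighting, and is not integrated with XQuery. The function names (`flatten`, `contains_phrase`) and the parameters `ignore_tags` and `ignore_content` are hypothetical, chosen only for illustration. The sketch assumes that an element listed in `ignore_tags` has its markup skipped but its text kept, that an element listed in `ignore_content` is skipped entirely, and that any other element acts as a barrier a phrase cannot cross.

```python
import re
import xml.etree.ElementTree as ET

WORD = re.compile(r"\w+")


def flatten(elem, ignore_tags, ignore_content):
    """Return the words under `elem` in document order.

    A None entry marks the boundary of an element that is not ignored,
    so a phrase cannot match across it.
    """
    out = WORD.findall((elem.text or "").lower())
    for child in elem:
        if child.tag in ignore_content:
            pass  # skip the whole subtree (assumed meaning of "ignore content")
        elif child.tag in ignore_tags:
            # markup is transparent, text is kept (assumed meaning of "ignore tag")
            out.extend(flatten(child, ignore_tags, ignore_content))
        else:
            out.append(None)
            out.extend(flatten(child, ignore_tags, ignore_content))
            out.append(None)
        # text following the child still belongs to `elem`
        out.extend(WORD.findall((child.tail or "").lower()))
    return out


def contains_phrase(elem, phrase, ignore_tags=(), ignore_content=()):
    """Naive check: do the words of `phrase` occur contiguously under `elem`?"""
    words = WORD.findall(phrase.lower())
    if not words:
        return False
    seq = flatten(elem, set(ignore_tags), set(ignore_content))
    n = len(words)
    return any(seq[i:i + n] == words for i in range(len(seq) - n + 1))


# Hypothetical mixed-content fragment: a bold word and an inline footnote.
doc = ET.fromstring("<p>The <b>quick</b> brown<fn>see note 3</fn> fox</p>")
matched = contains_phrase(doc, "quick brown fox",
                          ignore_tags={"b"}, ignore_content={"fn"})
unmatched = contains_phrase(doc, "quick brown fox")
```

In this fragment the phrase matches only when the `b` markup and the `fn` content are both ignored; with nothing ignored, the element boundaries and the footnote text interrupt it.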
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2003
Where ICDE
Authors Divesh Srivastava, Mary F. Fernández, Sihem Amer-Yahia, Yu Xu