Processing content-oriented XPath queries

Document-centric XML collections contain text-rich documents, marked up with XML tags that add lightweight semantics to the text. Querying such collections calls for a hybrid query language: the text-rich nature of the documents suggests a content-oriented (IR) approach, while the mark-up allows users to add structural constraints to their IR queries. Hybrid queries tend to be more expressive, which should, in principle, lead to better retrieval performance. In practice, processing these hybrid queries within an IR system turns out to be far from trivial, because a delicate balance must be struck between structural and content information. We propose an approach to processing such hybrid content-and-structure queries that decomposes a query into multiple content-only queries whose results are then combined in ways determined by the structural constraints of the original query. We evaluate our methods using the INEX 2003 test suite, and show (1) that effective ways of pr...
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where CIKM
Authors Börkur Sigurbjörnsson, Jaap Kamps, Maarten de Rijke