Effective, design-independent XML keyword search

Keyword search techniques that take advantage of XML structure make it very easy for ordinary users to query XML databases, but current approaches to processing these queries rely on intuitively appealing heuristics that are ultimately ad hoc. These approaches often retrieve irrelevant answers, overlook relevant answers, and cannot rank answers appropriately. To address these problems for data-centric XML, we propose coherency ranking (CR), a domain- and database design-independent ranking method for XML keyword queries that is based on an extension of the concept of mutual information. With CR, the results of a keyword query are invariant under schema reorganization. We analyze how previous approaches to XML keyword search approximate CR, and present efficient algorithms to perform CR. Our empirical evaluation with 65 user-supplied queries over two real-world XML data sets shows that CR has better precision and recall and provides better ranking than all previous approaches. Categori...
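The abstract says CR builds on an extension of mutual information but does not give the extended definition. As background only, the sketch below computes the standard pairwise mutual information of two discrete variables from observed value pairs. It is not the paper's CR measure. The function name `mutual_information` and the sample (title, author) pairs are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Standard mutual information, in bits, of two discrete variables.

    `pairs` is a list of observed (x, y) value pairs. This is the textbook
    quantity, NOT the extended measure used by coherency ranking.
    """
    n = len(pairs)
    joint = Counter(pairs)                # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)     # marginal counts of x
    py = Counter(y for _, y in pairs)     # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical data: values that always co-occur carry shared information,
# while values that co-occur in every combination carry none.
correlated = [("XML search", "Smith"), ("Graph mining", "Lee")]
independent = [("XML search", "Smith"), ("XML search", "Lee"),
               ("Graph mining", "Smith"), ("Graph mining", "Lee")]
mi_correlated = mutual_information(correlated)
mi_independent = mutual_information(independent)
```

In this toy setting, a higher value means that knowing one variable tells you more about the other, which is the intuition behind using such a measure to judge how strongly related two pieces of data are.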
Arash Termehchy, Marianne Winslett
Added 02 Sep 2010
Updated 02 Sep 2010
Type Conference
Year 2009
Where CIKM
Authors Arash Termehchy, Marianne Winslett