Efficient Keyword Search for Smallest LCAs in XML Databases

Keyword search is a proven, user-friendly way to query HTML documents on the World Wide Web. We propose keyword search in XML documents, modeled as labeled trees, and describe corresponding efficient algorithms. The proposed keyword search returns the set of smallest trees containing all keywords, where a tree is designated as "smallest" if it contains no proper subtree that also contains all keywords. Our core contribution, the Indexed Lookup Eager algorithm, exploits key properties of smallest trees to outperform prior algorithms by orders of magnitude when the query contains keywords with significantly different frequencies. The Scan Eager variant is tuned for the case where the keywords have similar frequencies. We analytically and experimentally evaluate these two variants of the Eager algorithm, along with the Stack algorithm [13]. We also present the XKSearch system, which utilizes the Indexed Lookup Eager, Scan Eager and Stack algorithms and a demo of which on DBLP data is ...
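To make the "smallest tree" definition concrete, here is a minimal brute-force sketch in Python. It is not the Indexed Lookup Eager, Scan Eager, or Stack algorithm from the paper; it only illustrates what result set those algorithms compute. It assumes each XML node is identified by a Dewey-style path (a tuple of child positions from the root), and the names `lca` and `slca_bruteforce` are hypothetical helpers introduced for illustration.

```python
from itertools import product


def lca(a, b):
    """Lowest common ancestor of two nodes given as Dewey paths (tuples).

    Assumption: a node's path is a prefix of every descendant's path,
    so the LCA is the longest common prefix.
    """
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def is_proper_ancestor(a, b):
    """True if node a is a strict ancestor of node b."""
    return len(a) < len(b) and b[:len(a)] == a


def slca_bruteforce(keyword_lists):
    """Return roots of the smallest trees containing all keywords.

    keyword_lists holds one list of Dewey paths per keyword. This tries
    every combination of one node per keyword, so it is exponential in
    the number of keywords and suitable only for tiny examples.
    """
    candidates = set()
    for combo in product(*keyword_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        candidates.add(node)
    # A candidate is "smallest" if no other candidate lies strictly below it.
    return sorted(
        c for c in candidates
        if not any(is_proper_ancestor(c, d) for d in candidates)
    )


# Hypothetical toy document: keyword "xml" occurs at two nodes,
# keyword "search" at two others.
xml_nodes = [(0, 0, 0), (0, 1, 0)]
search_nodes = [(0, 0, 1), (0, 2)]
result = slca_bruteforce([xml_nodes, search_nodes])
```

In this toy input, node `(0, 0)` contains both keywords in its subtree, so its ancestor `(0,)` is excluded even though it also contains both keywords.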
Yu Xu, Yannis Papakonstantinou
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2005
Authors Yu Xu, Yannis Papakonstantinou