Keyword query cleaning using hidden Markov models

In this paper, we consider the problem of keyword query cleaning for structured databases from a probabilistic perspective. Keyword query cleaning consists of rewriting the user query, segmenting the keywords, matching each segment to database items, and finally tagging the segments with their metadata. We present an efficient and robust solution using Hidden Markov Models (HMMs). By modeling user keyword queries with a generative probabilistic HMM-based model, we construct an HMM from the user-specified keyword query (and the database instance). The optimal statistical keyword cleaning is computed as the most likely path of the constructed HMM. Furthermore, we demonstrate how the optimal HMM-based keyword cleaning algorithm can be generalized to compute a stream of clean queries ranked from most likely to least likely. Finally, we present the implementation of the proposed system and its preliminary performance. Categories and Subject Descriptor...
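The abstract's central step, computing the most likely path of an HMM, is commonly done with the Viterbi algorithm. The sketch below is only an illustration of that standard algorithm, not the construction used in the paper: the states (`author`, `title`), the keyword vocabulary, and every probability are made-up assumptions, and the function name `viterbi` is hypothetical.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely path."""

    def log(p):
        # Work in log space; zero probability maps to -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[s] = (log-probability of the best path ending in s, that path)
    first = observations[0]
    best = {
        s: (log(start_p[s]) + log(emit_p[s].get(first, 0.0)), [s])
        for s in states
    }

    for obs in observations[1:]:
        step = {}
        for s in states:
            # Pick the best predecessor state r for arriving in s.
            score, path = max(
                ((best[r][0] + log(trans_p[r][s]), best[r][1]) for r in states),
                key=lambda t: t[0],
            )
            step[s] = (score + log(emit_p[s].get(obs, 0.0)), path + [s])
        best = step

    return max(best.values(), key=lambda t: t[0])


# Toy model (assumed values): each hidden state is a metadata tag,
# each observation is one keyword of the user query.
states = ["author", "title"]
start_p = {"author": 0.5, "title": 0.5}
trans_p = {
    "author": {"author": 0.7, "title": 0.3},
    "title": {"author": 0.3, "title": 0.7},
}
emit_p = {
    "author": {"ken": 0.6, "pu": 0.3, "markov": 0.1},
    "title": {"hidden": 0.4, "markov": 0.4, "models": 0.2},
}

query = ["ken", "pu", "hidden", "markov"]
score, path = viterbi(query, states, start_p, trans_p, emit_p)
print(path)  # ['author', 'author', 'title', 'title']
```

In this toy setting, consecutive keywords that receive the same tag can be read as one segment, so the query splits into an author segment ("ken pu") and a title segment ("hidden markov"). Producing a ranked stream of alternatives, as the abstract describes, would require extending the search beyond the single best path.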
Ken Q. Pu
Added 05 Dec 2009
Updated 05 Dec 2009
Type Conference
Year 2009
Authors Ken Q. Pu