Learning Noun Phrase Query Segmentation

Query segmentation is the process of taking a user’s search-engine query and dividing its tokens into individual phrases or semantic units. Identifying these query segments can potentially improve both document-retrieval precision, by first returning pages that contain the exact query segments, and document-retrieval recall, by allowing query expansion or substitution via the segmented units. We train and evaluate a machine-learned query segmentation system that achieves 86% segmentation-decision accuracy on a gold-standard set of segmented noun phrase queries, well above recently published approaches. Key enablers of this high performance are features derived from previous natural language processing work in noun compound bracketing. For example, token association features beyond simple N-gram counts provide powerful indicators of segmentation.
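To make the idea of a per-position segmentation decision concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract describes a machine-learned classifier over many features, whereas this sketch uses a fixed threshold on a single association score, pointwise mutual information (PMI), chosen here only as an assumed example of a token association measure. The counts, the `pmi` and `segment` helpers, and the threshold value are all hypothetical and invented for illustration.

```python
import math

# Hypothetical toy counts, invented for illustration only (not data from the paper).
UNIGRAM_COUNTS = {"new": 1000, "york": 400, "hotel": 800, "deals": 600}
BIGRAM_COUNTS = {("new", "york"): 350, ("york", "hotel"): 20, ("hotel", "deals"): 15}
TOTAL_TOKENS = 100_000


def pmi(left, right):
    """Pointwise mutual information of an adjacent token pair under the toy counts."""
    joint = BIGRAM_COUNTS.get((left, right), 0)
    if joint == 0:
        # Unseen pair: treat as maximally unassociated.
        return float("-inf")
    p_joint = joint / TOTAL_TOKENS
    p_left = UNIGRAM_COUNTS[left] / TOTAL_TOKENS
    p_right = UNIGRAM_COUNTS[right] / TOTAL_TOKENS
    return math.log(p_joint / (p_left * p_right))


def segment(tokens, threshold=3.0):
    """Make one break/no-break decision between each pair of adjacent tokens.

    Adjacent tokens stay in the same segment when their PMI meets the
    (assumed) threshold; otherwise a segment boundary is placed between them.
    """
    if not tokens:
        return []
    segments = [[tokens[0]]]
    for left, right in zip(tokens, tokens[1:]):
        if pmi(left, right) >= threshold:
            segments[-1].append(right)
        else:
            segments.append([right])
    return segments


if __name__ == "__main__":
    print(segment(["new", "york", "hotel", "deals"]))
```

A learned system would replace the single threshold with a classifier that combines many such features for each decision; the loop over adjacent token pairs is the part that corresponds to the segmentation decisions whose accuracy the abstract reports.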
Shane Bergsma, Qin Iris Wang
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Authors Shane Bergsma, Qin Iris Wang