
SIGIR
2010
ACM

How good is a span of terms?: exploiting proximity to improve web retrieval

Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features based on flexible proximity terms with recent state-of-the-art machine learning ranking models. We introduce a method of determining the goodness of a set of proximity terms that takes advantage of the structured nature of web documents, document metadata, and phrasal information from search engine user query logs. We perform experiments on a large real-world web data collection and show that using the goodness score of flexible proximity terms can improve ranking accuracy over state-of-the-art ranking methods by as much as 13%. We also show that we can improve accuracy on the hardest queries by as much as 9% relative to state-of-the-art approaches.
Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Infor...
Krysta Marie Svore, Pallika H. Kanani, Nazan Khan
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2010
Where SIGIR
Authors Krysta Marie Svore, Pallika H. Kanani, Nazan Khan