Document quality models for web ad hoc retrieval

Document content quality, usually ignored in the traditional ad hoc retrieval task, is a critical issue for Web search: Web pages vary far more in quality than, for example, newswire articles. To address this problem, we propose a document quality language model that is incorporated into the basic query likelihood retrieval model as a prior probability. Our results demonstrate that, on average, the new model is significantly better than the baseline (the query likelihood model) in terms of MRR and precision at the top ranks. We also give a detailed query analysis that offers insights into the limitations of the quality model and the relationship between document quality and relevance.
Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval
General Terms: Experimentation
Keywords: Document quality, prior probabilities, collection-document distance, we...
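The abstract describes folding a document prior into query likelihood scoring. A minimal sketch of that general idea follows, scoring a document as log P(D) + sum of log P(q | D). It is not the authors' implementation: the abstract does not say how the quality prior is estimated, so `log_prior` is simply a hypothetical input, and the Dirichlet smoothing, the `mu` default, and all function and parameter names are assumptions made for illustration.

```python
import math
from collections import Counter


def query_likelihood_with_prior(query, doc, collection_tf, collection_len,
                                log_prior, mu=2000.0):
    """Return log P(D) + sum over query terms of log P(q | D).

    P(q | D) uses Dirichlet smoothing against the collection model.
    log_prior stands in for a document quality prior (hypothetical input;
    how it is estimated is outside this sketch).
    """
    doc_tf = Counter(doc)
    doc_len = len(doc)
    score = log_prior
    for term in query:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            # Assumption: terms unseen in the whole collection are skipped.
            continue
        p_term = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p_term)
    return score
```

With a uniform prior the ranking reduces to plain query likelihood. With a non-uniform prior, two documents with identical text are separated only by the difference in their log priors.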
Yun Zhou, W. Bruce Croft
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where CIKM
Authors Yun Zhou, W. Bruce Croft