Extending Weighting Models with a Term Quality Measure

Abstract. Weighting models use lexical statistics, such as term frequencies, to derive term weights, which are used to estimate the relevance of a document to a query. Apart from the removal of stopwords, there is no other consideration of the quality of words that are being ‘weighted’. It is often assumed that term frequency is a good indicator for a decision to be made as to how relevant a document is to a query. Our intuition is that raw term frequency could be enhanced to better discriminate between terms. To do so, we propose using non-lexical features to predict the ‘quality’ of words, before they are weighted for retrieval. Specifically, we show how parts of speech (e.g. nouns, verbs) can help estimate how informative a word generally is, regardless of its relevance to a query/document. Experimental results with two standard TREC collections show that integrating the proposed term quality into two established weighting models enhances retrieval performance, over a baseli...
Christina Lioma, Iadh Ounis
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007