Sciweavers

RIAO
2007

Document frequency and term specificity

Document frequency is used in many applications in Information Retrieval and related fields. A common assumption is that a term’s document frequency reflects its level of specificity. However, empirical evidence supporting this assumption is limited. Therefore, a large-scale experiment was carried out, using multiple corpora, to gain further insight into the relationship between document frequency and term specificity. The results show that the assumption holds only at the very specific levels, which cover the majority of the vocabulary. The results also show that a larger corpus estimates specificity more accurately. However, co-occurrence information is shown to improve accuracy when only a small corpus is available.
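For readers unfamiliar with the quantity under study, the following is a minimal sketch of how document frequency is typically computed and turned into a specificity score via inverse document frequency. It is not the authors' experimental setup: the toy corpus, the whitespace tokenizer, and the function names `document_frequency` and `idf` are all assumptions made for illustration.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for doc in docs:
        # set() ensures a term is counted once per document
        df.update(set(doc.lower().split()))
    return df


def idf(df, n_docs):
    """Inverse document frequency, log(N / df); higher means rarer."""
    return {term: math.log(n_docs / count) for term, count in df.items()}


# Hypothetical three-document corpus, for illustration only
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the poodle barked",
]

df = document_frequency(docs)
scores = idf(df, len(docs))

# Rank terms from rarest (assumed most specific) to most common
for term in sorted(scores, key=scores.get, reverse=True):
    print(f"{term:8s} df={df[term]} idf={scores[term]:.3f}")
```

Under the assumption the abstract examines, a term such as "poodle" (low document frequency) would be treated as more specific than "cat", and "cat" as more specific than "the". With a corpus this small the counts are very coarse, which is in line with the abstract's point that corpus size affects how well specificity is estimated.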
Hideo Joho, Mark Sanderson
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2007
Where RIAO