IRAL
2003
ACM

Keyword-based document clustering

Document clustering aggregates related documents into clusters by evaluating the similarity between each document and the representatives of the clusters. Terms and their discriminating features are the clues for clustering, and these features are based on term and document frequencies. Feature selection based on frequency statistics limits how far the clustering algorithm can be improved, because it does not consider the contents of the objects being clustered. In this paper, we adopt a content-based analytic approach to refine the similarity computation and propose a keyword-based clustering algorithm. Experimental results show that content-based keyword weighting outperforms the frequency-based weighting method.
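
To make the frequency-based baseline concrete, here is a minimal sketch of weighting terms by term and document frequency and assigning a document to its most similar cluster representative. It is not the paper's keyword-based algorithm, which the abstract does not specify. The TF-IDF formula, the whitespace tokenization, the cosine measure, and the function names `tfidf_vectors`, `cosine`, and `assign` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Weight each term by term frequency times inverse document frequency.

    Assumption: plain whitespace tokenization and tf * log(N / df) weighting.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign(doc_vec, representatives):
    """Return the index of the cluster representative most similar to doc_vec."""
    return max(range(len(representatives)),
               key=lambda i: cosine(doc_vec, representatives[i]))


docs = ["apple banana apple", "banana cherry", "dog cat dog", "cat mouse"]
vecs = tfidf_vectors(docs)
# Hypothetical setup: use the first and third documents as cluster representatives.
reps = [vecs[0], vecs[2]]
labels = [assign(v, reps) for v in vecs]
```

Because the weights here depend only on how often a term occurs, two documents that share frequent but uninformative terms can look similar. That is the limitation the abstract attributes to frequency statistics, and a content-based keyword weighting would replace the `tf * log(N / df)` step.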
Seung-Shik Kang
Added: 05 Jul 2010
Updated: 05 Jul 2010
Type: Conference
Year: 2003
Where: IRAL
Authors: Seung-Shik Kang