Sciweavers

JMMA
2010

Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function

This paper follows a word-document co-clustering model introduced independently in 2001 by several authors, including I.S. Dhillon, H. Zha and C. Ding. The model builds a bipartite graph from word frequencies in documents, whose vertices are both the documents and the words. This bipartite graph is then partitioned so as to minimize the normalized cut objective function, which produces the document clustering. The fusion-fission graph partitioning metaheuristic is applied to several document collections using this word-document co-clustering model. The results reveal a real problem with the model: the partitions found almost always have a lower normalized cut value than the original document collection clustering. Moreover, measures of solution quality appear to be relatively independent of the normalized cut values of the partitions.
Keywords: graph partitioning, normalized cut, fusion-fission, document clustering, information retrieval, data mining
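To make the objective concrete, here is a minimal sketch of how the normalized cut of a word-document co-clustering could be evaluated. It is not the paper's implementation: the function name `normalized_cut` is hypothetical, the edge weights are assumed to be raw word counts (the abstract only says the graph is "based on word frequencies"), and the standard k-way definition is assumed, where each part contributes its cut weight divided by its volume (the sum of the weighted degrees of its vertices).

```python
import numpy as np


def normalized_cut(A, word_labels, doc_labels):
    """Normalized cut of a co-clustering of a bipartite word-document graph.

    A[i, j] is the assumed edge weight between word i and document j
    (here: a raw word count). word_labels[i] and doc_labels[j] give the
    part each vertex belongs to.
    """
    A = np.asarray(A, dtype=float)
    word_labels = np.asarray(word_labels)
    doc_labels = np.asarray(doc_labels)

    word_deg = A.sum(axis=1)  # weighted degree of each word vertex
    doc_deg = A.sum(axis=0)   # weighted degree of each document vertex

    total = 0.0
    for c in np.union1d(word_labels, doc_labels):
        w_in = word_labels == c
        d_in = doc_labels == c
        # Volume: sum of the degrees of all vertices in part c.
        vol = word_deg[w_in].sum() + doc_deg[d_in].sum()
        if vol == 0:
            continue  # a part with no incident edges contributes nothing
        # Weight of edges with both endpoints inside part c.
        within = A[np.ix_(w_in, d_in)].sum()
        # Each internal edge is counted twice in vol (once per endpoint),
        # so the weight leaving the part is vol - 2 * within.
        cut = vol - 2.0 * within
        total += cut / vol
    return total


# Toy collection: 3 words (rows) and 2 documents (columns).
A = [[2, 0],
     [1, 0],
     [0, 3]]
block_diagonal = normalized_cut(A, [0, 0, 1], [0, 1])  # no edge is cut
mixed = normalized_cut(A, [0, 1, 1], [0, 1])           # word 1 is misplaced
```

On this toy matrix the first co-clustering cuts no edge, while the second separates word 1 from the only document it occurs in. Comparing the value for a reference clustering with the value for a partition found by an optimizer is the kind of comparison the abstract refers to.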
Charles-Edmond Bichot
Added 19 May 2011
Updated 19 May 2011
Type Journal
Year 2010
Where JMMA
Authors Charles-Edmond Bichot