Sciweavers

KDD
2007
ACM

Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis

To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1981-2004. For delineating this complex, interdisciplinary field, a novel bibliometric retrieval strategy is used. Given that the performance of unsupervised clustering and classification of scientific publications is significantly improved by deeply merging textual contents with the structure of the citation graph, we proceed with a hybrid clustering method based on Fisher's inverse chi-square. The optimal number of clusters is determined by a compound semiautomatic strategy comprising a combination of distance-based and stability-based methods. We also investigate the relationship between number of Latent Semantic Indexing factors, number of clusters, and clustering performance. The HITS and PageRank algorithms are used to determine representative publications in each cluster. Next, we develop a methodol...
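As an illustration of the combining rule the abstract names, the following is a minimal sketch of Fisher's inverse chi-square method for merging independent p-values, for example one derived from a text-based comparison and one from a citation-based comparison of the same pair of documents. How the paper itself obtains those p-values is not stated in the abstract; the function name `fisher_combined_p` and the sample inputs are assumptions made for this example.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's inverse chi-square method.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function has a
    # closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Assumed example inputs: one p-value per data source for a document pair.
combined = fisher_combined_p([0.05, 0.05])
print(round(combined, 4))
```

Two moderately small p-values combine into a smaller one, which is the sense in which agreement between the textual and citation evidence strengthens the overall signal.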
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2007
Where KDD
Authors Bart De Moor, Frizo A. L. Janssens, Wolfgang Glänzel