Model order selection for bio-molecular data clustering

Background: Cluster analysis has been widely applied to investigate structure in bio-molecular data. A drawback of most clustering algorithms is that they cannot automatically detect the "natural" number of clusters underlying the data, and in many cases we do not have enough a priori biological knowledge to evaluate both the number of clusters and their validity. Recently, several methods based on the concept of stability have been proposed to estimate the "optimal" number of clusters. Despite their successful application to the analysis of complex bio-molecular data, assessing the statistical significance of the discovered clustering solutions and detecting multiple structures simultaneously present in high-dimensional bio-molecular data remain major problems. Results: We propose a stability method based on randomized maps that exploits the high dimensionality and relatively low cardinality that characterize bio-molecular data, by selecting ...
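The abstract is truncated here, so the authors' actual procedure is not shown. As a rough illustration of the general idea only (stability of a clustering under random maps as a guide to the number of clusters), the following sketch assumes a Gaussian random projection, a plain k-means with farthest-first initialization, and the Rand index as the agreement measure. All function names, parameters, and the synthetic two-blob data are hypothetical choices made for this example and are not taken from the paper.

```python
import math
import random
from itertools import combinations


def random_projection(points, target_dim, rng):
    """Map points to target_dim dimensions with a Gaussian random matrix."""
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
              for _ in range(source_dim)]
    return [[sum(p[i] * matrix[i][j] for i in range(source_dim))
             for j in range(target_dim)] for p in points]


def kmeans(points, k, rng, iterations=20):
    """Lloyd's k-means with farthest-first initialization (an assumption)."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        farthest = max(points,
                       key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def stability(points, k, target_dim, trials, rng):
    """Mean agreement between the original clustering and projected ones."""
    reference = kmeans(points, k, rng)
    scores = []
    for _ in range(trials):
        projected = random_projection(points, target_dim, rng)
        scores.append(rand_index(reference, kmeans(projected, k, rng)))
    return sum(scores) / trials


def make_blobs(rng, per_blob=15, dim=50, offset=10.0):
    """Synthetic data: two well-separated Gaussian blobs (hypothetical)."""
    blob_a = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(per_blob)]
    blob_b = [[rng.gauss(offset, 1.0) for _ in range(dim)]
              for _ in range(per_blob)]
    return blob_a + blob_b


rng = random.Random(0)
data = make_blobs(rng)
scores = {k: stability(data, k, target_dim=10, trials=5, rng=rng)
          for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)  # candidate k with the highest stability
print(scores, best_k)
```

The sketch only ranks candidate values of k by their average agreement score. It does not attempt the statistical significance assessment or the detection of multiple structures that the abstract names as the open problems.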
Alberto Bertoni, Giorgio Valentini
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2007
Authors Alberto Bertoni, Giorgio Valentini