Evaluation of clustering algorithms for gene expression data

Background: Cluster analysis is an integral part of high-dimensional data analysis. In the context of large-scale gene expression data, a filtered set of genes is grouped according to expression profiles using one of the numerous clustering algorithms in the statistics and machine learning literature. A closely related problem is selecting, from this long list of available algorithms, one that is "optimal" in some sense.
Results: In this paper, we propose two validation measures, each with two parts: one measuring the statistical consistency (stability) of the clusters produced and the other representing their biological functional congruence. Smaller values of these indices indicate better performance for a clustering algorithm. We illustrate this approach using two case studies with publicly available gene expression data sets: one involving SAGE data from breast cancer patients and the other in...
Susmita Datta, Somnath Datta
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2006
Authors Susmita Datta, Somnath Datta