Clustering Genes Using Heterogeneous Data Sources

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. Such sources include protein-protein interaction data, transcription factor and regulatory element data, comparative genomics data, protein expression data and much more. These data provide us with a means to begin elucidating the large-scale modular organization of the cell. Conclusions drawn from more than one data source are likely to lead to new insights. A data source is complete if it provides information about every gene in the genome, and incomplete otherwise. With a view toward a combined analysis of heterogeneous sources of data, we consider the challenging task of developing exploratory analytical techniques to deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm we developed performs cl...
Added 05 Mar 2011
Updated 05 Mar 2011
Type Journal
Year 2010
Authors Erliang Zeng, Chengyong Yang, Tao Li, Giri Narasimhan