RECOMB
2009
Springer

Finding Biologically Accurate Clusterings in Hierarchical Tree Decompositions Using the Variation of Information

Abstract. Hierarchical clustering is a popular method for grouping together similar elements based on a distance measure between them. In many cases, annotation information for some elements is known beforehand, which can aid the clustering process. We present a novel approach for decomposing a hierarchical clustering into the clusters that optimally match a set of known annotations, as measured by the variation of information metric. Our approach is general and does not require the user to enter the number of clusters desired. We apply it to two biological domains: finding protein complexes within protein interaction networks and identifying species within metagenomic DNA samples. For these two applications, we test the quality of our clusters by using them to predict complex and species membership, respectively. We find that our approach generally outperforms the commonly used heuristic methods.
Key words: Hierarchical Tree Decompositions, Variation of Information, Clustering, Protei...
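For readers unfamiliar with the variation of information (VI) mentioned in the abstract, the following is a minimal sketch of its standard definition, VI(A, B) = H(A | B) + H(B | A), for two flat clusterings of the same elements. It is not taken from the paper: the function name `variation_of_information`, the label-list input format, and the use of natural logarithms are assumptions made for illustration, and the paper's procedure for selecting clusters from the tree is not reproduced here.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Return VI(A, B) = H(A|B) + H(B|A) in nats.

    labels_a[i] and labels_b[i] are the cluster labels assigned to
    element i by clusterings A and B (hypothetical input format).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same elements")
    n = len(labels_a)
    if n == 0:
        return 0.0

    size_a = Counter(labels_a)                 # cluster sizes in A
    size_b = Counter(labels_b)                 # cluster sizes in B
    overlap = Counter(zip(labels_a, labels_b))  # sizes of pairwise overlaps

    vi = 0.0
    for (a, b), n_ab in overlap.items():
        p_ab = n_ab / n
        # log(p_ab / p_a) + log(p_ab / p_b), written with raw counts
        vi -= p_ab * (log(n_ab / size_a[a]) + log(n_ab / size_b[b]))
    return vi
```

VI is zero exactly when the two clusterings induce the same partition, regardless of how the clusters are named, and it grows as the partitions disagree. In a setting like the one the abstract describes, one clustering would come from cutting the hierarchy and the other from the known annotations, restricted to the annotated elements; that restriction is an assumption of this sketch.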
Added 23 Nov 2009
Updated 23 Nov 2009
Type Conference
Year 2009
Where RECOMB
Authors Saket Navlakha, James Robert White, Niranjan Nagarajan, Mihai Pop, Carl Kingsford