BMCBI
2007

Large scale clustering of protein sequences with FORCE - A layout based heuristic for weighted cluster editing

Background: Detecting groups of functionally related proteins from their amino acid sequence alone has been a long-standing challenge in computational genome research. Several clustering approaches, following different strategies, have been published to attack this problem. Today, new sequencing technologies provide huge amounts of sequence data that have to be clustered efficiently, with constant or increased accuracy and at increased speed. Results: We advocate that the model of weighted cluster editing, also known as transitive graph projection, is well suited to protein clustering. We present the FORCE heuristic, which is based on transitive graph projection and clusters arbitrary sets of objects, given pairwise similarity measures. In particular, we apply FORCE to the problem of protein clustering and show that it outperforms the most popular existing clustering tools (Spectral clustering, TribeMCL, GeneRAGE, Hierarchical clustering, and Affinity Propagation). Furthermore, we show that F...
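To make the weighted cluster editing model mentioned in the abstract concrete, the following minimal Python sketch scores a candidate clustering. It is not the FORCE heuristic itself, and it rests on assumptions not stated in this listing: a pair counts as an edge when its similarity exceeds a threshold, removing an edge costs the similarity minus the threshold, adding a missing edge costs the threshold minus the similarity, and unlisted pairs have similarity 0. The function name `editing_cost` and the toy data are hypothetical.

```python
from itertools import combinations


def editing_cost(similarity, clusters, threshold):
    """Cost of editing the thresholded similarity graph into the given clusters.

    similarity: dict mapping unordered pairs (u, v) to a similarity score.
    clusters:   list of disjoint sets of objects (the candidate clustering).
    threshold:  pairs scoring above it are treated as edges (assumption).
    """
    cluster_of = {node: idx
                  for idx, members in enumerate(clusters)
                  for node in members}
    cost = 0.0
    for u, v in combinations(sorted(cluster_of), 2):
        # Unlisted pairs default to similarity 0.0 (assumption).
        s = similarity.get((u, v), similarity.get((v, u), 0.0))
        same_cluster = cluster_of[u] == cluster_of[v]
        if same_cluster and s < threshold:
            cost += threshold - s   # a missing edge must be added
        elif not same_cluster and s > threshold:
            cost += s - threshold   # an existing edge must be removed
    return cost


# Hypothetical toy data: four objects with pairwise similarity scores.
toy_similarity = {
    ("a", "b"): 9.0,
    ("b", "c"): 8.0,
    ("a", "c"): 3.0,
    ("c", "d"): 5.5,
}
toy_threshold = 5.0

cost_abc_d = editing_cost(toy_similarity, [{"a", "b", "c"}, {"d"}], toy_threshold)
cost_ab_cd = editing_cost(toy_similarity, [{"a", "b"}, {"c", "d"}], toy_threshold)
cost_all = editing_cost(toy_similarity, [{"a", "b", "c", "d"}], toy_threshold)
```

Under these assumptions, a lower cost means the clustering needs fewer and cheaper edge edits to make the graph transitive. A heuristic for this model searches for a low-cost partition rather than enumerating all of them.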
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2007
Where BMCBI
Authors Tobias Wittkop, Jan Baumbach, Francisco P. Lobo, Sven Rahmann