Sciweavers

ALMOB
2006

On the maximal cliques in c-max-tolerance graphs and their application in clustering molecular sequences

Given a set S of n locally aligned sequences, a necessary prerequisite for subsequent computations, such as the generation of a phylogenetic tree, is to partition S into groups of very similar sequences. This article introduces a new clustering method which partitions S into subsets such that the overlap of each pair of sequences within a subset is at least a given percentage c of the lengths of the two sequences. We show that this problem can be reduced to finding all maximal cliques in a special kind of max-tolerance graph which we call a c-max-tolerance graph. Previously we have shown that finding all maximal cliques in general max-tolerance graphs can be done efficiently in O(n^3 + out). Here, using a new kind of sweep-line algorithm, we show that the restriction to c-max-tolerance graphs yields a better runtime of O(n^2 log n + out). Furthermore, we present another algorithm which is much easier to implement, and though theoretically slower than the first one, is still...
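The reduction described in the abstract can be illustrated with a small sketch. It is not the sweep-line algorithm of the article: it builds the graph by brute-force pairwise comparison and enumerates maximal cliques with the generic Bron-Kerbosch procedure (with pivoting), which is a substitute technique chosen for brevity. The sketch assumes that each aligned sequence is modeled as a half-open integer interval (start, end) and that two intervals are adjacent when their overlap is at least c times the length of the longer one; the function names are hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """Length of the intersection of two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def c_max_tolerance_graph(intervals, c):
    """Adjacency sets: i and j are joined if they overlap by >= c * longer length.

    Assumption: the tolerance of an interval is c times its own length,
    so the edge condition uses the longer of the two intervals.
    """
    adj = {i: set() for i in range(len(intervals))}
    for i, j in combinations(range(len(intervals)), 2):
        a, b = intervals[i], intervals[j]
        longest = max(a[1] - a[0], b[1] - b[0])
        if overlap(a, b) >= c * longest:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Three strongly overlapping intervals and one outlier, with c = 0.75.
intervals = [(0, 100), (10, 110), (20, 120), (90, 200)]
groups = maximal_cliques(c_max_tolerance_graph(intervals, 0.75))
print(groups)
```

Maximal cliques may share vertices (for example, a chain of three intervals in which only neighbouring pairs overlap enough yields two cliques with a common member), so turning the cliques into a partition needs an additional step that this sketch does not cover.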
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2006
Where ALMOB
Authors Katharina Anna Lehmann, Michael Kaufmann, Stephan Steigele, Kay Nieselt