Discovering Frequent Agreement Subtrees from Phylogenetic Data

We study a new data mining problem concerning the discovery of frequent agreement subtrees (FASTs) from a set of phylogenetic trees. A phylogenetic tree, or phylogeny, is an unordered tree in which the order among siblings is unimportant. Furthermore, each leaf in the tree has a label representing a taxon (species or organism) name, whereas internal nodes are unlabeled. The tree may have a root, representing the common ancestor of all species in the tree, or may be unrooted. An unrooted phylogeny arises due to the lack of sufficient evidence to infer a common ancestor of the taxa in the tree. The FAST problem addressed here is a natural extension of the maximum agreement subtree (MAST) problem widely studied in the computational phylogenetics community. The paper establishes a framework for tackling the FAST problem for both rooted and unrooted phylogenetic trees using data mining techniques. We first develop a novel canonical form for rooted trees together with a phylogeny-aware tree ...
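The abstract's two central notions, unordered leaf-labeled trees and agreement on a shared set of taxa, can be illustrated with a minimal sketch. This is not the paper's novel canonical form or its mining algorithm (the abstract is truncated before describing them); it uses a generic sorted-string canonical form as a stand-in. The names `canonical` and `restrict`, the nested-tuple tree encoding, and the example taxa are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple, Union

# Assumed encoding: a leaf is a taxon name (str); an internal node is an
# unlabeled tuple of child subtrees. Rooted trees only.
Tree = Union[str, Tuple["Tree", ...]]


def canonical(tree: Tree) -> str:
    """Generic canonical string for an unordered rooted tree.

    Children are sorted by their own canonical strings, so two trees that
    differ only in sibling order map to the same string. This is a stand-in,
    not the canonical form developed in the paper.
    """
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"


def restrict(tree: Tree, taxa: Set[str]) -> Optional[Tree]:
    """Restrict a tree to the given taxa, suppressing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in taxa else None
    kept = [r for r in (restrict(child, taxa) for child in tree) if r is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse a unary internal node
    return tuple(kept)


# Sibling order is unimportant: both encodings describe the same phylogeny.
t1: Tree = (("human", "chimp"), "gorilla")
t2: Tree = ("gorilla", ("chimp", "human"))
same_tree = canonical(t1) == canonical(t2)

# t3 has an extra taxon, but agrees with t1 once restricted to the shared taxa.
t3: Tree = ((("human", "chimp"), "orangutan"), "gorilla")
shared = {"human", "chimp", "gorilla"}
agrees = canonical(restrict(t3, shared)) == canonical(restrict(t1, shared))
```

Under these assumptions, a candidate subtree is an agreement subtree of a set of trees when the restriction of each tree to the candidate's taxa yields the same canonical string; counting how many trees in a collection pass that check is one plausible way to read "frequent" here.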
Sen Zhang, Jason Tsong-Li Wang
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where TKDE