Sciweavers

RECOMB
2005
Springer

Information Theoretic Approaches to Whole Genome Phylogenies

We describe a novel method for efficient reconstruction of phylogenetic trees, based on sequences of whole genomes or proteomes, whose lengths may vary greatly. The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings. It is intrinsically related to information theoretic tools (Kullback-Leibler relative entropy). We present an algorithm for efficiently computing these distances. In principle, the distance of two ℓ long sequences can be calculated in O(ℓ) time. We implemented the algorithm using suffix arrays. The implementation is fast enough to enable the construction of the proteome phylogenomic tree for hundreds of species, and the genome phylogenomic forest for almost two thousand viruses. An initial analysis of the results exhibits a remarkable agreement with "acceptable phylogenetic and taxonomic truth". To assess our approach, it was compared to the traditional (...
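The quantity at the core of the abstract, the average length of maximum common substrings, can be illustrated with a small sketch. This is not the authors' implementation: it is a naive quadratic-or-worse loop rather than the suffix-array algorithm they mention, and the function names `longest_match_lengths` and `average_common_substring` are hypothetical. It also stops at the raw average; how the paper normalizes and symmetrizes that average into a distance is not stated in this abstract, so that step is left out.

```python
def longest_match_lengths(a: str, b: str) -> list:
    """For each start position i in `a`, return the length of the longest
    substring a[i:i+k] that occurs somewhere in `b` (0 if a[i] is absent)."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the match one character at a time while it still occurs in b.
        # Naive on purpose: each `in` test is a full substring search.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def average_common_substring(a: str, b: str) -> float:
    """Average of the per-position longest match lengths of `a` against `b`.
    Assumption: a plain arithmetic mean over all positions of `a`."""
    lengths = longest_match_lengths(a, b)
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    print(longest_match_lengths("ABCD", "BCDA"))     # [1, 3, 2, 1]
    print(average_common_substring("ABCD", "BCDA"))  # 1.75
```

Intuitively, closely related sequences share long substrings and so give a large average, while unrelated sequences give a small one. The measure is not symmetric in its two arguments, which is one reason a further step is needed before it can serve as a distance.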
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2005
Where RECOMB
Authors David Burstein, Igor Ulitsky, Tamir Tuller, Benny Chor