Sciweavers

ALMOB
2008

Noisy: Identification of problematic columns in multiple sequence alignments

Motivation: Sequence-based methods for phylogenetic reconstruction from (nucleic acid) sequence data are notoriously plagued by two effects: homoplasies and alignment errors. Large evolutionary distances imply a large number of homoplastic sites. Because most protein-coding genes show dramatic variations in substitution rate that are correlated along the sequence, this often leads to a patchwork pattern of (i) phylogenetically informative and (ii) effectively randomized regions. Furthermore, alignment errors accumulate in highly variable regions, sometimes producing misleading signals in phylogenetic reconstruction.

Results: We present a method that identifies phylogenetically uninformative homoplastic sites in a multiple sequence alignment by assessing the distribution of character states along a cyclic ordering of the taxa. Removal of these sites appears to improve the performance of phylogenetic reconstruction algorithms as measured by various i...
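
The idea of scoring a column by how its character states are distributed along a cyclic ordering of the taxa can be illustrated with a small sketch. This is not the scoring used by Noisy itself: the function names `count_changes` and `column_score`, the permutation-based comparison, and the parameters `n_shuffles` and `seed` are all assumptions made for illustration. The sketch counts state changes between cyclically adjacent taxa and compares that count with counts obtained from randomly shuffled orderings.

```python
import random


def count_changes(column, order):
    """Count cyclically adjacent taxa pairs whose character states differ.

    column: sequence of character states, one per taxon (one alignment column)
    order:  sequence of taxon indices describing a cyclic ordering
    """
    n = len(order)
    return sum(
        column[order[i]] != column[order[(i + 1) % n]]
        for i in range(n)
    )


def column_score(column, order, n_shuffles=1000, seed=0):
    """Hypothetical score: fraction of random orderings that show strictly
    more state changes than the given ordering.

    Values near 1 suggest the states cluster along the ordering; values
    near 0 suggest the column looks no better than random. A constant
    column scores 0 because no ordering can produce any change.
    """
    rng = random.Random(seed)
    observed = count_changes(column, order)
    shuffled = list(order)
    more = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if count_changes(column, shuffled) > observed:
            more += 1
    return more / n_shuffles


if __name__ == "__main__":
    order = list(range(8))      # assumed cyclic ordering of 8 taxa
    clustered = "AAAACCCC"      # states form two contiguous blocks
    alternating = "ACACACAC"    # states change at every step
    print(count_changes(clustered, order), column_score(clustered, order))
    print(count_changes(alternating, order), column_score(alternating, order))
```

Under this assumed score, a column whose states form contiguous blocks along the ordering rates high, and a column whose states alternate rates zero. A real filter would also need a rule for gaps and a cutoff below which columns are removed, neither of which is specified here.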
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2008
Where ALMOB
Authors Andreas W. M. Dress, Christoph Flamm, Guido Fritzsch, Stefan Grünewald, Matthias Kruspe, Sonja J. Prohaska, Peter F. Stadler