Using Multiple Alignments to Improve Gene Prediction

The multiple species de novo gene prediction problem can be stated as follows: given an alignment of genomic sequences from two or more organisms, predict the location and structure of all protein-coding genes in one or more of the sequences. Here, we present a new system, N-SCAN (a.k.a. TWINSCAN 3.0), for addressing this problem. N-SCAN can model dependencies between the aligned sequences, context-dependent substitution rates, and insertions and deletions in the sequences. An implementation of N-SCAN was created and used to generate predictions for the entire human genome. An analysis of the predictions reveals that N-SCAN's predictive accuracy in human exceeds that of all previously published whole-genome de novo gene predictors. In addition, predictions were generated for the genome of the fruit fly Drosophila melanogaster to demonstrate the applicability of N-SCAN to invertebrate gene prediction.
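To make the input side of the problem statement concrete, here is a minimal sketch of how a multi-species alignment might be walked column by column in the coordinates of one target sequence. It is illustrative only and is not N-SCAN's actual data structure or algorithm: the function name `alignment_columns`, the dictionary layout, the `-` gap symbol, and the toy sequences are all assumptions introduced for this example.

```python
from typing import Dict, Iterator, Tuple

GAP = "-"  # assumed gap symbol; real alignment formats may differ


def alignment_columns(
    alignment: Dict[str, str], target: str
) -> Iterator[Tuple[int, Tuple[str, ...]]]:
    """Yield (target_position, column) for every column where the target has a base.

    The column lists the target's character first, then the other species
    in the order they appear in `alignment`.
    """
    names = [target] + [name for name in alignment if name != target]
    length = len(alignment[target])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned rows must have the same length")

    position = 0  # 0-based coordinate in the ungapped target sequence
    for i in range(length):
        column = tuple(alignment[name][i] for name in names)
        if column[0] == GAP:
            continue  # insertion relative to the target: no target coordinate
        yield position, column
        position += 1


# Toy two-species alignment (made-up sequences, not real genomic data).
alignment = {"human": "ATG-CCTAA", "mouse": "ATGACC-AA"}
columns = list(alignment_columns(alignment, "human"))
```

A gap in a non-target row, such as the mouse `-` paired with the human `T`, stays visible in the column, so a downstream model could treat insertions and deletions as evidence rather than discarding them.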
Samuel S. Gross, Michael R. Brent
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2005
Authors Samuel S. Gross, Michael R. Brent