
SDM
2010
SIAM

Fast and Accurate Gene Prediction by Decision Tree Classification

Gene prediction is one of the most challenging tasks in genome analysis; many tools have been developed for it and are still evolving. In this paper, we present a novel gene prediction method that is both fast and accurate, making use of protein homology and decision tree classification. Specifically, we apply the principled concepts of entropy and decision trees to assist in the gene prediction process. Our goal is to resolve exact gene structures by finding "coding" regions (exons) and "non-coding" regions (introns). Unlike traditional classification tasks, however, we do not have explicit class labels for such structures in the genes. We use a protein sequence (the product of a gene) as a query to find genes that are homologous to the query protein, and we deduce class labels based on homology. Our experiments on the genomes of two nematodes, C. elegans and C. briggsae, show that in addition to achieving prediction accuracy comparable with that ...
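The abstract names entropy and decision tree concepts without showing how they combine. As a minimal illustrative sketch (not the authors' method, which the truncated abstract does not describe), the following Python code computes the Shannon entropy of a label sequence and picks the split point with the highest information gain. The function names and the toy "exon"/"intron" labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, split_index):
    """Entropy reduction from cutting `labels` into two parts at `split_index`."""
    left, right = labels[:split_index], labels[split_index:]
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


def best_split(labels):
    """Index of the cut point that maximizes information gain."""
    return max(range(1, len(labels)),
               key=lambda i: information_gain(labels, i))


# Hypothetical toy data: labels assumed to have been deduced beforehand
# (for example from homology); they are not taken from the paper.
labels = ["exon"] * 4 + ["intron"] * 4
split = best_split(labels)
```

On this toy input the best cut falls at the boundary between the two label runs, which is the standard behavior of an entropy-based decision tree split: it favors partitions whose parts are as pure as possible.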
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where SDM
Authors Rong She, Jeffrey Shih-Chieh Chu, Ke Wang, Nansheng Chen