
BMCBI
2007

Applying negative rule mining to improve genome annotation

Background: Unsupervised annotation of proteins by software pipelines suffers from very high error rates. Spurious functional assignments are usually caused by unwarranted homology-based transfer of information from existing database entries to the new target sequences. We have previously demonstrated that data mining in large sequence annotation databanks can help identify annotation items that are strongly associated with each other, and that exceptions from strong positive association rules often point to potential annotation errors. Here we investigate the applicability of negative association rule mining to revealing erroneously assigned annotation items.

Results: Almost all exceptions from strong negative association rules are connected to at least one wrong attribute in the feature combination making up the rule. The fraction of annotation features flagged by this approach as suspicious is strongly enriched in errors and constitutes about 0.6% of the whole body of the similarit...
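The idea described in the abstract, flagging database entries that violate a strong negative association rule, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `negative_rule_exceptions`, the thresholds, the confidence definition (1 minus the fraction of records with A that also carry B), and the toy annotation items are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def negative_rule_exceptions(records, min_support=3, min_confidence=0.9):
    """Find exceptions to strong negative rules of the form "A implies NOT B".

    records: list of sets, each set holding the annotation items of one entry.
    Returns a list of (a, b, confidence, exception_indices) tuples.
    Hypothetical helper; thresholds are arbitrary illustration values.
    """
    # Count how many records carry each item.
    item_count = Counter(item for rec in records for item in rec)

    # Count how many records carry each ordered pair of distinct items.
    pair_count = Counter()
    for rec in records:
        for a, b in permutations(rec, 2):
            pair_count[(a, b)] += 1

    # Only items seen often enough can take part in a rule.
    frequent = [item for item, count in item_count.items() if count >= min_support]

    flagged = []
    for a in frequent:
        for b in frequent:
            if a == b:
                continue
            both = pair_count[(a, b)]
            # Confidence of "a implies NOT b": share of records with a that lack b.
            confidence = 1 - both / item_count[a]
            # A strong rule that still has a few violations yields suspects.
            if both > 0 and confidence >= min_confidence:
                exceptions = [
                    idx for idx, rec in enumerate(records) if a in rec and b in rec
                ]
                flagged.append((a, b, confidence, exceptions))
    return flagged


# Toy data (made up): two items that almost never co-occur, plus one entry
# carrying both, which becomes the suspicious exception.
records = (
    [{"Membrane"} for _ in range(10)]
    + [{"Nuclear"} for _ in range(10)]
    + [{"Membrane", "Nuclear"}]
)
flagged = negative_rule_exceptions(records)
for a, b, confidence, exceptions in flagged:
    print(f"{a} -> NOT {b}: confidence={confidence:.2f}, suspicious entries={exceptions}")
```

In this sketch the rule is reported in both directions because the confidence is computed per antecedent. A real analysis would need to decide how to handle that symmetry, and how to choose support and confidence thresholds for a databank of realistic size.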
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2007
Where BMCBI
Authors Irena I. Artamonova, Goar Frishman, Dmitrij Frishman