Sciweavers

BIBE
2010
IEEE

The Effect of Sequence Error and Partial Training Data on BLAST Accuracy

Metagenomics is the study of environmental samples. Because few tools exist for metagenomic analysis, a natural step has been to use the popular homology tool BLAST to search for sequence similarity between DNA reads and an administered database. Most biologists use this method today without knowing BLAST's accuracy, especially when a particular taxonomic class is under-represented in the database. The aim of this paper is to benchmark the performance of BLAST for taxonomic classification of metagenomic datasets in a supervised setting, meaning that the database contains microbes of the same class as the "unknown" query DNA reads. We examine well-represented and under-represented genera and phyla in order to study their effect on the accuracy of BLAST. We investigate the degradation in BLAST accuracy when genome coverage is reduced in the training database, as well as the performance when errors are introduced into the query DNA reads. We conclude that on fine-resolution classes, ...
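The two perturbations named in the abstract, errors in the query reads and reduced genome coverage in the training database, can be illustrated with a minimal Python sketch. The abstract does not state the error model or the coverage-reduction procedure used in the paper, so everything here is an assumption made for illustration: uniform substitution errors, non-overlapping fixed-length fragments, and the hypothetical helper names `introduce_substitutions` and `subsample_fragments`.

```python
import random

BASES = "ACGT"


def introduce_substitutions(read, error_rate, rng):
    """Return a copy of `read` in which each A/C/G/T base is replaced by a
    different base with probability `error_rate`.

    Assumption: uniform substitution errors only (no insertions or deletions).
    """
    out = []
    for base in read:
        if base in BASES and rng.random() < error_rate:
            # Pick one of the three other bases uniformly at random.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def subsample_fragments(genome, fragment_len, fraction, rng):
    """Split `genome` into non-overlapping fragments of `fragment_len` bases
    and keep a random `fraction` of them (at least one).

    Assumption: this stands in for a training database with partial coverage.
    """
    fragments = [
        genome[i:i + fragment_len] for i in range(0, len(genome), fragment_len)
    ]
    keep = max(1, round(fraction * len(fragments)))
    return rng.sample(fragments, keep)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is repeatable
    genome = "".join(rng.choice(BASES) for _ in range(100))
    noisy_read = introduce_substitutions(genome[:30], 0.05, rng)
    partial_db = subsample_fragments(genome, 10, 0.5, rng)
    print(noisy_read)
    print(len(partial_db), "of 10 fragments kept")
```

In a benchmark of this kind, the perturbed reads would be used as queries and the retained fragments would form the database; sweeping `error_rate` and `fraction` would then show how classification accuracy degrades under each perturbation.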
Steven D. Essinger, Gail L. Rosen
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Where BIBE