
BMCBI
2008

GBParsy: A GenBank flatfile parser library with high speed

Background: The GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and ease of readability. For a computer to use the data in the file, a parsing process is required, performed according to a given grammar for the sequence and the description in a GBF. Several parser libraries for the GBF have already been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time because of the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is a significant need for a parsing program with high speed and efficient use of system memory. Results: We developed GBParsy, a C language-based library that parses GBF files. The parsing speed was maximized by using content-specified functions in place of regular expressions that are...
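To illustrate what replacing regular expressions with content-specific functions can look like, here is a minimal sketch in C. It is not GBParsy's code or API: the function name `gbf_split_keyword_line` and the constant `GBF_VALUE_COLUMN` are hypothetical, and the sketch assumes the usual GBF layout in which a top-level keyword starts in column 1 and its value starts in column 13.

```c
#include <stddef.h>
#include <string.h>

/* Assumed 0-based column where the value of a top-level keyword line
   begins (column 13 in 1-based terms). */
#define GBF_VALUE_COLUMN 12

/* Hypothetical helper, not part of GBParsy: split a line such as
   "DEFINITION  Example sequence." into keyword and value by fixed
   columns instead of matching a regular expression.
   Returns 1 on success, 0 if the line does not start with a keyword
   or an output buffer is too small. */
static int gbf_split_keyword_line(const char *line,
                                  char *keyword, size_t keyword_cap,
                                  char *value, size_t value_cap)
{
    size_t len = strlen(line);
    size_t klen, vlen;

    /* Ignore a trailing newline or carriage return. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    /* Blank lines and continuation lines (leading space) carry no keyword. */
    if (len == 0 || line[0] == ' ')
        return 0;

    /* The keyword ends at the first space or at the value column. */
    klen = 0;
    while (klen < len && klen < GBF_VALUE_COLUMN && line[klen] != ' ')
        klen++;
    if (klen + 1 > keyword_cap)
        return 0;
    memcpy(keyword, line, klen);
    keyword[klen] = '\0';

    /* Everything from the value column onward is the value. */
    vlen = (len > GBF_VALUE_COLUMN) ? len - GBF_VALUE_COLUMN : 0;
    if (vlen + 1 > value_cap)
        return 0;
    if (vlen > 0)
        memcpy(value, line + GBF_VALUE_COLUMN, vlen);
    value[vlen] = '\0';
    return 1;
}
```

Because the function only scans the line once and copies two fixed ranges, it avoids compiling or running a pattern matcher for every line, which is the general idea behind trading regular expressions for format-specific code.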
Tae-Ho Lee, Yeon-Ki Kim, Baek Hie Nahm
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2008
Where BMCBI
Authors Tae-Ho Lee, Yeon-Ki Kim, Baek Hie Nahm