BNCOD
2003

Persistent Indexing Technology for Large Sequences

There are two aspects to the work being presented here. The first is a novel persistent index structure for genomic data, a prototype of which has been completed. The second, using this index as an example, is a generic index development framework, which is under construction. We propose a variation of the suffix tree, the Top Compressed Suffix Tree, which has been designed to allow the on-disk construction of indexes over multi-gigabyte sequences. This form of the suffix tree extends the work of Hunt et al. [1] by improving the performance of the partitioned construction algorithm when the size of the sequence being indexed is comparable to that of the available main memory, and by providing a compact representation of the index on secondary memory. This work forms part of the GIDOF project—a project to provide a Generic Index Development and Operation Framework. GIDOF addresses the management of performance-critical parameters, automatic parameter exploration and tuning, and the p...
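The partitioned construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the Top Compressed Suffix Tree itself, nor the published algorithm of Hunt et al. It only shows the general idea of splitting suffixes by a fixed-length prefix so that one partition is handled per pass over the sequence. A sorted list of suffix positions stands in for each partition's subtree, and an in-memory dictionary stands in for secondary storage. The names `build_partitioned_index` and `find`, the `prefix_len` parameter, and the DNA alphabet are all assumptions made for illustration.

```python
from itertools import product


def build_partitioned_index(seq, prefix_len=2, alphabet="ACGT"):
    """Build one sorted suffix list per prefix, one pass over seq per partition.

    Hypothetical helper: suffixes shorter than prefix_len are not indexed.
    """
    index = {}
    for prefix in map("".join, product(alphabet, repeat=prefix_len)):
        # Only the suffixes that start with this prefix are held at once.
        starts = [
            i
            for i in range(len(seq) - prefix_len + 1)
            if seq.startswith(prefix, i)
        ]
        if starts:
            # Stand-in for building this partition's subtree and writing it out.
            index[prefix] = sorted(starts, key=lambda i: seq[i:])
    return index


def find(seq, index, pattern, prefix_len=2):
    """Return start positions of pattern; assumes len(pattern) >= prefix_len."""
    candidates = index.get(pattern[:prefix_len], [])
    return sorted(i for i in candidates if seq.startswith(pattern, i))
```

With `seq = "ACGTACGA"` and the default `prefix_len`, a lookup for `"ACG"` consults only the `"AC"` partition. In a disk-based design, the prefix length would be one of the parameters that trades the number of passes over the sequence against the memory needed per partition.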
Robert Japp
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Where BNCOD
Authors Robert Japp