Highly Scalable and Accurate Seeds for Subsequence Alignment

We propose a method for finding seeds for the local alignment of two nucleotide sequences. Our method uses randomized algorithms to find approximate seeds. We present a dynamic index to store the fingerprints of k-grams and a highly scalable and accurate (HSA) algorithm to incorporate randomization into the process of seed generation. Experimental results show that our method produces better-quality seeds with improved running time and memory usage compared to traditional non-spaced and spaced seeds. The presented algorithm scales well to longer seed lengths while maintaining quality and performance.
1 Motivation
Locating similar subsequences between a query sequence and the sequences in a database is one of the most fundamental problems in bioinformatics. This is also known as the local alignment problem. Local alignment matches pairs of letters between two subsequences. A score is then assigned for each match. Every mismatch and gap is penalized with appropriate mismatch,...
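For orientation, the traditional non-spaced seed baseline that the abstract compares against can be sketched as an exact k-gram lookup. This is not the HSA algorithm or the paper's dynamic index, and it involves no randomization; the names `fingerprint`, `build_index`, and `find_seeds` and the 2-bit base encoding are assumptions made for illustration only.

```python
from collections import defaultdict

# Assumed 2-bit encoding of the four nucleotides (illustrative choice).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def fingerprint(kgram):
    """Map a k-gram to an integer by reading it as a base-4 number.

    This fingerprint is exact (collision-free); it only stands in for
    whatever fingerprint the paper actually uses.
    """
    value = 0
    for base in kgram:
        value = value * 4 + CODE[base]
    return value


def build_index(sequence, k):
    """Index every k-gram of `sequence` by fingerprint -> start positions."""
    index = defaultdict(list)
    for pos in range(len(sequence) - k + 1):
        index[fingerprint(sequence[pos:pos + k])].append(pos)
    return index


def find_seeds(query, database, k):
    """Return (query_pos, database_pos) pairs whose k-grams match exactly."""
    index = build_index(database, k)
    seeds = []
    for qpos in range(len(query) - k + 1):
        key = fingerprint(query[qpos:qpos + k])
        for dpos in index.get(key, []):
            seeds.append((qpos, dpos))
    return seeds


if __name__ == "__main__":
    print(find_seeds("ACGTAC", "TTACGT", 4))
```

Each reported pair marks a position where an alignment could be extended. In this exact version the index holds one position entry per k-gram of the database sequence.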
Abhijit Pol, Tamer Kahveci
Added 24 Jun 2010
Updated 24 Jun 2010
Type Conference
Year 2005
Where BIBE
Authors Abhijit Pol, Tamer Kahveci