ISMB
2000

Accelerating Protein Classification Using Suffix Trees

Position-specific scoring matrices have been used extensively to recognize highly conserved protein regions. We present a method for accelerating these searches using a suffix tree data structure computed from the sequences to be searched. Building on earlier work that allows evaluation of a scoring matrix to be stopped early, the suffix tree-based method excludes many protein segments from consideration at once by pruning entire subtrees. Although suffix trees are usually expensive in space, the fact that scoring matrix evaluation requires an in-order traversal allows nodes to be stored more compactly without loss of speed, and our implementation requires only 17 bytes of primary memory per input symbol. Searches are accelerated by up to a factor of ten.
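As a rough illustration of the pruning idea described in the abstract, the following minimal sketch scores a position-specific scoring matrix against a depth-limited suffix trie and abandons a whole subtree as soon as the best score still achievable falls below the threshold. It is an assumption-laden simplification, not the authors' implementation: it uses an uncompressed trie of fixed-length windows instead of a compact suffix tree, and the names `build_trie`, `search_pssm`, and the toy matrix are hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[Dict[str, int]]  # one column of symbol scores per position


def build_trie(text: str, depth: int) -> dict:
    """Depth-limited suffix trie (a stand-in for a real suffix tree).

    Every window of length `depth` is inserted once; the node reached at
    the end of a window records the window's start position.
    """
    root = {"children": {}, "positions": []}
    for start in range(len(text) - depth + 1):
        node = root
        for ch in text[start:start + depth]:
            node = node["children"].setdefault(
                ch, {"children": {}, "positions": []}
            )
        node["positions"].append(start)
    return root


def search_pssm(text: str, pssm: Matrix, threshold: int) -> List[Tuple[int, int]]:
    """Return (position, score) for every window scoring >= threshold."""
    m = len(pssm)
    # max_rest[i] = best score still obtainable from columns i..m-1
    max_rest = [0] * (m + 1)
    for i in range(m - 1, -1, -1):
        max_rest[i] = max_rest[i + 1] + max(pssm[i].values())

    hits: List[Tuple[int, int]] = []

    def walk(node: dict, i: int, score: int) -> None:
        if i == m:
            hits.extend((p, score) for p in node["positions"])
            return
        for ch, child in node["children"].items():
            s = score + pssm[i].get(ch, float("-inf"))
            # Early stop: even a perfect remainder cannot reach the
            # threshold, so every window below this node is skipped at once.
            if s + max_rest[i + 1] < threshold:
                continue
            walk(child, i + 1, s)

    walk(build_trie(text, m), 0, 0)
    return sorted(hits)


# Toy matrix (hypothetical) that rewards the motif "ACG".
toy_pssm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
print(search_pssm("ACGTACGGACGT", toy_pssm, 6))  # [(0, 6), (4, 6), (8, 6)]
```

Because windows that share a prefix share a path in the trie, each prefix is scored once and a failed prefix discards all of its occurrences together. The sketch makes no attempt at the compact node layout mentioned in the abstract.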
Bogdan Dorohonceanu, Craig G. Nevill-Manning
Added: 01 Nov 2010
Updated: 01 Nov 2010
Type: Conference
Year: 2000
Where: ISMB