Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...
In genomic sequence analysis tasks like splice site recognition or promoter identification, large amounts of training sequences are available, and indeed needed to achieve suffici...
We study suitable indexing techniques to support efficient exact match search in large biological sequence databases. We propose a suffix tree (ST) representation, called STA-DF, ...
Mihail Halachev, Nematollaah Shiri, Anand Thamildu...
Finding motifs in biological sequences is one of the most intriguing problems for string algorithms designers due to, on the one hand, the numerous applications of this problem in...