Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...
We have observed in recent years a growing interest in similarity search on large collections of biological sequences. Contributing to the interest, this paper presents a method fo...
A method is described for identification and classification of proteins encoded in large DNA sequences. Previously, an automated system was introduced for the general detection of...
The growing interest in genomic research has caused an explosive growth in the size of DNA databases making it increasely challenging to perform searches on them. In this paper, w...
Zhenqiang Tan, Xia Cao, Beng Chin Ooi, Anthony K. ...
We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequen...