We propose a new method for building persistent suffix trees for indexing genomic data. Our algorithm, DiGeST (Disk-Based Genomic Suffix Tree), improves significantly over previous ...
Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upt...
Approximate string matching on large DNA sequence data is very important in bioinformatics. Some studies have shown that the suffix tree is an efficient data structure for approxim...
In various applications such as data cleansing, being able to retrieve categorical or numerical attributes based on notions of approximate match (e.g., edit distance, numerical di...
Liang Jin, Nick Koudas, Chen Li, Anthony K. H. Tun...
Querying and integrating sources of structured data from the Web in most cases requires similarity-based concepts to deal with data level conflicts. This is due to the often errone...
Scalable Distributed Data Structures (SDDSs) store large scalable files over a distributed RAM of nodes in a grid or a P2P network. The files scale transparently for the applicat...