Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and simila...
Xiaohong Wang, Aaron M. Smalter, Jun Huan, Gerald ...
We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of sett...
We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequen...
To enable efficient similarity search in large databases, many indexing techniques use a linear transformation scheme to reduce dimensions and allow fast approximation. In this re...
Background: Similaritysearch in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screeni...
Xiaohong Wang, Jun Huan, Aaron M. Smalter, Gerald ...