Abstract. This paper presents a general technique for optimally transforming any dynamic data structure D that operates on atomic and indivisible keys by constant-time comparisons,...
We compare different statistical characterizations of a set of strings, for three different histogram-based distances. Given a distance, a set of strings may be characterized by it...
: Scalable Distributed Data Structures (SDDSs) store large scalable files over a distributed RAM of nodes in a grid or a P2P network. The files scale transparently for the applicat...
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and ...
Edit distance based string similarity join is a fundamental operator in string databases. Increasingly, many applications in data cleaning, data integration, and scientific compu...