Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Background: During gene expression analysis by Serial Analysis of Gene Expression (SAGE), duplicate ditags are routinely removed from the data analysis because they are suspected...
Jeppe Emmersen, Anna M. Heidenblut, Annabeth Laurs...
This paper proposes and compares two novel schemes for near-duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using ...
Ondrej Chum, James Philbin, Michael Isard, Andrew ...
We consider a pilot-assisted interleave-division multiple access (IDMA) system transmitting over block-fading channels. We describe this system in terms of a factor graph and use ...
We study the fundamental problem of computing distances between nodes in large graphs such as the web graph and social networks. Our objective is to be able to answer distance que...
Atish Das Sarma, Sreenivas Gollapudi, Marc Najork,...