Locality Sensitive Hashing (LSH) is widely used for efficient retrieval of candidate matches in very large audio, video, and image systems. However, extremely large reference dat...
In this paper we study the privacy preservation properties of a specific technique for query log anonymization: tokenbased hashing. In this approach, each query is tokenized, and ...
Ravi Kumar, Jasmine Novak, Bo Pang, Andrew Tomkins
We study the performance issue of the “iterative” record linkage (RL) problem, where match and merge operations may occur together in iterations until convergence emerges. We ...
Most research on Distributed Hash Tables (DHTs) assumes ephemeral, lightly loaded deployments. Each node has a lifetime of a few hours and initiates a lookup once every few second...
We study selectivity estimation techniques for set similarity queries. A wide variety of similarity measures for sets have been proposed in the past. In this work we concentrate o...
Marios Hadjieleftheriou, Xiaohui Yu, Nick Koudas, ...