This paper establishes the theoretical framework of b-bit minwise hashing. The original minwise hashing method has become a standard technique for estimating set similarity (e.g.,...
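As a rough illustration of the set-similarity estimation this abstract refers to, the following is a minimal sketch of minwise hashing together with a b-bit variant that compares only the lowest b bits of each minimum. The helper names (`hash64`, `minhash_signature`, `estimate_jaccard`, `estimate_jaccard_b_bit`) are hypothetical, salted BLAKE2b stands in for random permutations, and the b-bit correction assumes hash values behave like independent uniform 64-bit integers; the paper's own estimator may differ.

```python
import hashlib


def hash64(item, index):
    """Hypothetical helper: the index-th 64-bit hash of an item (salted BLAKE2b)."""
    salt = index.to_bytes(8, "little")
    digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes):
    """Keep, for each hash function, the minimum hash value over the set."""
    return [min(hash64(x, i) for x in items) for i in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two minima agree."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


def estimate_jaccard_b_bit(sig_a, sig_b, b):
    """Compare only the lowest b bits of each minimum, then correct for chance matches.

    Assumption: two different minima agree on their lowest b bits with
    probability 2**-b, so the match rate is R + (1 - R) * 2**-b.
    """
    mask = (1 << b) - 1
    matches = sum(1 for u, v in zip(sig_a, sig_b) if (u & mask) == (v & mask))
    match_rate = matches / len(sig_a)
    chance = 2.0 ** -b
    return (match_rate - chance) / (1.0 - chance)


if __name__ == "__main__":
    a = set(range(0, 200))
    b = set(range(100, 300))  # true Jaccard similarity is 100 / 300
    sig_a = minhash_signature(a, 512)
    sig_b = minhash_signature(b, 512)
    print("full minima:", estimate_jaccard(sig_a, sig_b))
    print("lowest 2 bits:", estimate_jaccard_b_bit(sig_a, sig_b, 2))
```

Storing only b bits per hash shrinks each signature at the cost of higher variance per position, which is why the b-bit estimate typically needs more hash functions for the same accuracy.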
In this article we propose a new hashing framework for tree-structured data. Our method maps an unordered tree into a multiset of simple wedge-shaped structures referred to as pivot...
We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [Pagh et al. STOC’07]. For (1 + ε)-approxima...
A locality sensitive hashing scheme is a distribution on a family F of hash functions operating on a collection of objects, such that for two objects x, y, Pr_{h∈F}[h(x) = h(y)] = si...
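To make the collision-probability definition above concrete, here is a minimal sketch of one well-known instance, random-hyperplane hashing, where each hash is the sign of a dot product with a random Gaussian vector and two vectors at angle θ collide with probability 1 − θ/π. The choice of this scheme and all function names are assumptions made for illustration; the truncated abstract does not say which schemes it covers.

```python
import math
import random


def make_hyperplanes(dim, count, seed=0):
    """Draw `count` random Gaussian normal vectors, one per hash function."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def hyperplane_bits(vector, hyperplanes):
    """One bit per hyperplane: which side of it the vector falls on."""
    return [
        1 if sum(r_i * v_i for r_i, v_i in zip(r, vector)) >= 0 else 0
        for r in hyperplanes
    ]


def collision_rate(bits_x, bits_y):
    """Empirical estimate of Pr[h(x) = h(y)] over the sampled hash functions."""
    return sum(1 for u, v in zip(bits_x, bits_y) if u == v) / len(bits_x)


def angle_similarity(x, y):
    """The similarity this scheme is sensitive to: 1 - theta / pi."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    theta = math.acos(max(-1.0, min(1.0, dot / norm)))
    return 1.0 - theta / math.pi


if __name__ == "__main__":
    planes = make_hyperplanes(dim=2, count=2000, seed=1)
    x, y = [1.0, 0.0], [0.0, 1.0]
    print("empirical:", collision_rate(hyperplane_bits(x, planes), hyperplane_bits(y, planes)))
    print("expected: ", angle_similarity(x, y))
```

The empirical collision rate approaches the target similarity as the number of sampled hash functions grows.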
We consider fingerprinting methods for collaborative filtering (CF) systems. In general, CF systems show their real strength when supplied with enormous data sets. Earl...