Given a metric space (X, d_X), c ≥ 1, r > 0, and p, q ∈ [0, 1], a distribution over mappings H : X → N is called an (r, cr, p, q)-sensitive hash family if any two points in...
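The definition above is cut off in this excerpt, so the following is only a minimal sketch under an assumed reading: the usual convention that points within distance r collide with probability at least p, while points at distance cr or more collide with probability at most q. It uses bit sampling over Hamming space, a standard textbook family that is not necessarily the one the source goes on to discuss. The names `hamming`, `sample_bit_hash`, and `collision_rate` are hypothetical and chosen for illustration.

```python
import random


def hamming(x, y):
    """Hamming distance between two equal-length bit sequences."""
    return sum(a != b for a, b in zip(x, y))


def sample_bit_hash(dim, rng):
    """Draw one hash function from the bit-sampling family.

    Each function returns a single, randomly chosen coordinate of its
    input, so its values (0 or 1) are natural numbers.
    """
    i = rng.randrange(dim)
    return lambda x: x[i]


def collision_rate(x, y, trials=20000, seed=0):
    """Empirical estimate of Pr[h(x) == h(y)] over random draws of h."""
    rng = random.Random(seed)
    dim = len(x)
    hits = 0
    for _ in range(trials):
        h = sample_bit_hash(dim, rng)
        hits += h(x) == h(y)
    return hits / trials


# Illustrative data (assumed, not from the source): 20-bit vectors.
dim = 20
base = [0] * dim
near = [1] * 2 + [0] * (dim - 2)    # Hamming distance 2 from base
far = [1] * 10 + [0] * (dim - 10)   # Hamming distance 10 from base

print(hamming(base, near), collision_rate(base, near))
print(hamming(base, far), collision_rate(base, far))
```

For bit sampling the exact collision probability is 1 - d/dim, where d is the Hamming distance. Under the assumed convention, taking r = 2 and c = 5 in this toy setting gives p = 0.9 and q = 0.5, and the empirical rates should land close to those values.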
This paper proposes a query-by-humming method based on locality-sensitive hashing (LSH). The method constructs an index of melodic fragments by extracting pitch vectors from a dat...
Similarity joins are important operations with a broad range of applications. In this paper, we study the problem of vector similarity join size estimation (VSJ). It is a generali...
Similarity search methods are widely used as kernels in various data mining and machine learning applications, including those in computational biology and web search/clustering. Near...
We present a new Bi-level LSH algorithm to perform approximate k-nearest neighbor search in high-dimensional spaces. Our formulation is based on a two-level scheme. In the first ...