Free Online Productivity Tools
i2Speak
i2Symbol
i2OCR
iTex2Img
iWeb2Print
iWeb2Shot
i2Type
iPdf2Split
iPdf2Merge
i2Bopomofo
i2Arabic
i2Style
i2Image
i2PDF
iLatex2Rtf
Sci2ools

KDD

2008

ACM

2008

ACM

: Locality Sensitive Hash functions are invaluable tools for approximate near neighbor problems in high dimensional spaces. In this work, we are focused on LSH schemes where the similarity metric is the cosine measure. The contribution of this work is a new class of locality sensitive hash functions for the cosine similarity measure based on the theory of concomitants, which arises in order statistics. Consider n i.i.d sample pairs, f(X1; Y1); (X2; Y2); : : : ; (Xn; Yn)g obtained from a bivariate distribution f(X; Y ). Concomitant theory captures the relation between the order statistics of X and Y in the form of a rank distribution given by Prob(Rank(Yi)=jjRank(Xi)=k). We exploit properties of the rank distribution towards developing a locality sensitive hash family that has excellent collision rate properties for the cosine measure. The computational cost of the basic algorithm is high for high hash lengths. We introduce several approximations based on the properties of concomitant o...

Added |
30 Nov 2009 |

Updated |
30 Nov 2009 |

Type |
Conference |

Year |
2008 |

Where |
KDD |

Authors |
Kave Eshghi, Shyamsundar Rajaram |

Comments (0)