Scalable similarity search with optimized kernel hashing

Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creating compact and efficient hash codes that preserve data similarity. By efficient, we refer to the low correlation (and thus low redundancy) among generated codes. However, most existing hash methods are designed only for vector data. In this paper, we develop a new hashing algorithm to create efficient codes for large scale data of general formats with any kernel function, including kernels on vectors, graphs, sequences, sets and so on. Starting with the idea analogous to spectral hashing, novel formulations and solutions are proposed such that a kernel based hash function can be explicitly represented and optimized, and directly applied to compute compact hash codes for new samples of general formats. Moreover, we incorporate efficient techniques, such as Nyström approximation, to further reduce time and spa...
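To make the idea of a kernel based hash function concrete, here is a minimal sketch, not the paper's algorithm. It assumes each bit has the form sign(sum_j a_j * k(z_j, x) - b), where the z_j are a small set of landmark samples. The coefficients a_j are drawn at random purely for illustration, whereas the paper optimizes them. The names rbf_kernel, fit_kernel_hash, hash_code and hamming, the landmark choice, and the median threshold are all assumptions introduced here.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel on equal-length numeric sequences (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def fit_kernel_hash(data, landmarks, n_bits, kernel=rbf_kernel, seed=0):
    """Build n_bits hash functions of the form
    h_k(x) = 1 if sum_j coeffs[k][j] * kernel(landmarks[j], x) > thresholds[k] else 0.

    ASSUMPTION: coefficients are random Gaussian draws, not optimized as in the paper.
    """
    rng = random.Random(seed)
    coeffs = [[rng.gauss(0.0, 1.0) for _ in landmarks] for _ in range(n_bits)]
    # Kernel values between every training sample and every landmark.
    features = [[kernel(z, x) for z in landmarks] for x in data]
    thresholds = []
    for row in coeffs:
        proj = sorted(sum(a * f for a, f in zip(row, feats)) for feats in features)
        mid = len(proj) // 2
        # Median threshold, so each bit splits the training data roughly in half.
        median = proj[mid] if len(proj) % 2 else 0.5 * (proj[mid - 1] + proj[mid])
        thresholds.append(median)
    return {"landmarks": landmarks, "coeffs": coeffs,
            "thresholds": thresholds, "kernel": kernel}


def hash_code(model, x):
    """Compute the binary code of a new sample using only kernel evaluations."""
    feats = [model["kernel"](z, x) for z in model["landmarks"]]
    return tuple(
        1 if sum(a * f for a, f in zip(row, feats)) > b else 0
        for row, b in zip(model["coeffs"], model["thresholds"])
    )


def hamming(c1, c2):
    """Number of differing bits between two codes of equal length."""
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))


# Toy usage on random 3-dimensional vectors; the first 10 samples act as landmarks.
rng = random.Random(1)
data = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
model = fit_kernel_hash(data, landmarks=data[:10], n_bits=8)
codes = [hash_code(model, x) for x in data]
```

Because the hash function touches the data only through kernel evaluations, the same sketch would apply to graphs, sequences or sets once rbf_kernel is swapped for a kernel defined on those objects. Similar items can then be retrieved by comparing codes with hamming.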

Junfeng He, Wei Liu, Shih-Fu Chang

Data Mining | General Formats | Hash Codes | KDD 2010 | Large Scale

Post Info

Added	29 Jan 2011
Updated	29 Jan 2011
Type	Conference
Year	2010
Where	KDD
Authors	Junfeng He, Wei Liu, Shih-Fu Chang
