We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of sett...
A function on n variables is called a k-junta if it depends on at most k of its variables. In this article, we show that it is possible to test whether a function is a k-junta or ...
The problem of similarity search (query-by-content) has attracted much research interest. It is a difficult problem because of the inherently high dimensionality of the data. The ...
Given a set of model graphs D and a query graph q, containment search aims to find all model graphs g ∈ D such that q contains g (q ⊇ g). Due to the wide adoption of graph models, f...
Chen Chen, Xifeng Yan, Philip S. Yu, Jiawei Han, D...
Sketching techniques can provide approximate answers to aggregate queries either for data-streaming or distributed computation. Small space summaries that have linearity propertie...