In many fields, e.g. decision-making, numerical values in [0,1] are available and one is often interested in detecting which are similar. In this paper, we propose an operator...
Current data repositories hold a variety of data types, including audio, images and time series. State-of-the-art techniques for indexing such data and doing query processing r...
We show that eigenvector decomposition can be used to extract a term taxonomy from a given collection of text documents. So far, methods based on eigenvector decompositio...
Holger Bast, Georges Dupret, Debapriyo Majumdar, B...
We consider the problem of estimating CPU (distance computations) and I/O costs for processing range and k-nearest neighbors queries over metric spaces. Unlike the specific case ...
This paper addresses the problem of similar image retrieval, especially in the setting of large-scale datasets with millions to billions of images. The core novel contribution is ...