The Minkowski-sum cost model indicates that balanced data partitioning is not beneficial for high-dimensional data. Thus, we study several unbalanced partitioning methods and propose ...
Searching in metric spaces is a very active field since it offers methods for indexing and searching by similarity in collections of unstructured data. These methods sele...
Data mining applications analyze large collections of set data and high-dimensional categorical data. Search on these data types is not restricted to the classic problems of minin...
Similarity retrieval mechanisms should utilize generalized quadratic form distance functions as well as the Euclidean distance function, since ellipsoid query parameters may vary...
Similarity search leveraging distance-based index structures is increasingly being used for complex data types. It has been shown that for high-dimensional uniform vectors with si...
Rui Mao, Wenguo Liu, Daniel P. Miranker, Qasim Iqb...