With the proliferation of multimedia data, there is increasing need to support the indexing and searching of high dimensional data. Recently, a vector approximation based techniqu...
Abstract. Nearest neighbor search has a wide variety of applications. Unfortunately, the majority of search methods do not scale well with dimensionality. Recent efforts have been ...
Most existing semi-supervised learning methods are based on the smoothness assumption that data points in the same high density region should have the same label. This assumption, ...
Nearest Neighbor search is an important and widely used problem in a number of important application domains. In many of these domains, the dimensionality of the data representati...
We consider the problem of joining massive datasets. We propose two techniques for minimizing disk I/O cost of join operations for both spatial and sequence data. Our techniques o...