Document clustering techniques mostly rely on single-term analysis of the document data set, as in the Vector Space Model. To better capture the structure of documents, the unde...
A general-purpose object indexing technique is described that combines the virtues of principal component analysis with the favorable matching properties of high-dimensional spaces...
This paper compares several indexing methods for person names extracted from text, developed for an information retrieval system with requirements for fast approximate matching of...
Genomic IR, characterized by its highly specific information needs, severe synonymy and polysemy problems, long term names, and rapidly growing literature, is challenging the IR communit...
In this paper, we develop a content-based video classification approach to support semantic categorization, high-dimensional indexing and multi-level access. Our contribu...