As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Previous research in novelty detection has focused on the task of finding novel material, given a set or stream of documents on a certain topic. This study investigates the more ...
We consider fast two-sided error-tolerant search that is robust against errors both on the query side (type alogrithm, find documents with algorithm) and on the document si...
Identifying unusual or unique characteristics of an observed sample is useful in forensics in general and handwriting analysis in particular. Rarity is formulated as the probabi...
We describe an evaluation of result set filtering techniques for providing ultra-high precision in the task of presenting related news for general web queries. In this task, the n...
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury...