To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
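For readers unfamiliar with the two fingerprinting schemes named in the abstract above, here is a minimal sketch. It is not the code from that paper. The helper names (`shingles`, `jaccard`, `simhash`, `hamming`), the 3-word shingle size, the 64-bit fingerprint width, and the use of MD5 as the feature hash are all assumptions chosen for illustration.

```python
import hashlib

BITS = 64  # fingerprint width (assumption; matches the 8 digest bytes used below)


def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) of text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Set resemblance |a & b| / |a | b|, the measure shingling estimates."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def simhash(features):
    """Simhash-style fingerprint: per-bit majority vote over feature hashes."""
    counts = [0] * BITS
    for feature in features:
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash of the feature
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:  # bit is set when more features voted 1 than 0
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox jumped over the lazy dog"
resemblance = jaccard(shingles(doc_a), shingles(doc_b))
distance = hamming(simhash(shingles(doc_a)), simhash(shingles(doc_b)))
```

In this sketch two documents would be flagged as near-duplicates when `resemblance` is high or `distance` is small. The thresholds are application-specific and are not given in the text above.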
Web textual advertising can be interpreted as a search problem over the corpus of ads available for display in a particular context. In contrast to conventional information retrie...
Andrei Z. Broder, Massimiliano Ciaramita, Marcus F...
Although Locality-Sensitive Hashing (LSH) is a promising approach to similarity search in high-dimensional spaces, it has not been considered practical partly because its search q...
Wei Dong, Zhe Wang, William Josephson, Moses Chari...
Random projection (RP) is a common technique for dimensionality reduction under L2 norm for which many significant space embedding results have been demonstrated. However, many si...
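As a rough illustration of the L2 case the abstract above starts from, the sketch below projects vectors through a random Gaussian matrix scaled by 1/sqrt(k), which approximately preserves Euclidean distances. It is a generic textbook construction, not the method of that paper. The function names, the dimensions, and the fixed seed are assumptions.

```python
import math
import random


def random_projection_matrix(d, k, seed=0):
    """Build a k x d matrix of N(0, 1) entries scaled by 1/sqrt(k)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector x to k dimensions (matrix-vector product)."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical data: two random points in d = 500 dimensions, reduced to k = 200.
d, k = 500, 200
data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]
y = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]

R = random_projection_matrix(d, k)
ratio = l2(project(R, x), project(R, y)) / l2(x, y)  # expected to be close to 1
```

A larger `k` tightens the distortion at the cost of a bigger projected representation. The guarantee holds for the L2 norm only.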
Online detection of video clips that present previously unseen events in a video stream remains an open challenge. For this online new event detection (ONED) task, existi...