Sciweavers

48 search results - page 2 / 10
» Collection statistics for fast duplicate document detection
CIKM
2011
Springer
12 years 6 months ago
Partial duplicate detection for large book collections
A framework is presented for discovering partial duplicates in large collections of scanned books with optical character recognition (OCR) errors. Each book in the collection is r...
Ismet Zeki Yalniz, Ethem F. Can, R. Manmatha
SIGIR
2008
ACM
13 years 6 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
BMCBI
2006
13 years 6 months ago
UVPAR: fast detection of functional shifts in duplicate genes
Background: The imprint of natural selection on gene sequences is often difficult to detect. A plethora of methods have been devised to detect genetic changes due to selective pro...
Vicente Arnau, Miguel Gallach, J. Ignasi Lucas, Ig...
ICMCS
2006
IEEE
14 years 6 days ago
Large-Scale Duplicate Detection for Web Image Search
Finding visually identical images in large image collections is important for many applications such as intellectual property protection and search result presentation. Several a...
Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma
CIVR
2007
Springer
14 years 10 days ago
Scalable near identical image and shot detection
This paper proposes and compares two novel schemes for near duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using ...
Ondrej Chum, James Philbin, Michael Isard, Andrew ...