Search Sciweavers | Sciweavers

26 search results - page 1 / 6

CIKM
2011
ACM

191 views, Information Technology

Partial duplicate detection for large book collections

12 years 9 months ago

Download www.cs.umass.edu

A framework is presented for discovering partial duplicates in large collections of scanned books with optical character recognition (OCR) errors. Each book in the collection is r...

Ismet Zeki Yalniz, Ethem F. Can, R. Manmatha

LREC
2008

130 views, Education

Detecting Co-Derivative Documents in Large Text Collections

13 years 10 months ago

Download www.lrec-conf.org

We have analyzed the SPEX algorithm by Bernstein and Zobel (2004) for detecting co-derivative documents using duplicate n-grams. Although we totally agree with the claim that not ...

Jan Pomikálek, Pavel Rychlý

SIGIR
2008
ACM

176 views, Information Technology

SpotSigs: robust and efficient near duplicate detection in large web collections

13 years 9 months ago

Download ilpubs.stanford.edu

Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...

Martin Theobald, Jonathan Siddharth, Andreas Paepc...

SIGIR
2010
ACM

169 views, Information Technology

Efficient partial-duplicate detection based on sequence matching

13 years 4 months ago

Download homepage.fudan.edu.cn

With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...

Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang

ICMCS
2006
IEEE

188 views, Multimedia

Large-Scale Duplicate Detection for Web Image Search

14 years 3 months ago

Download www.cecs.uci.edu

Finding visually identical images in large image collections is important for many applications such as intellectual property protection and search result presentation. Several a...

Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma
