Sciweavers

ESWS
2010
Springer
13 years 7 months ago
Efficient Semantic-Aware Detection of Near Duplicate Resources
Efficiently detecting near duplicate resources is an important task when integrating information from various sources and applications. Once detected, near duplicate reso...
Ekaterini Ioannou, Odysseas Papapetrou, Dimitrios ...
WWW
2008
ACM
14 years 5 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
SIGIR
2008
ACM
13 years 4 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
P2P
2010
IEEE
13 years 2 months ago
Optimizing Near Duplicate Detection for P2P Networks
In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To thi...
Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersd...
MM
2009
ACM
13 years 9 months ago
MyFinder: near-duplicate detection for large image collections
The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...
Xin Yang, Qiang Zhu, Kwang-Ting Cheng