Sciweavers

32 search results - page 2 / 7
» Near-duplicate detection for web-forums
WWW
2008
ACM
14 years 6 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
P2P
2010
IEEE
13 years 3 months ago
Optimizing Near Duplicate Detection for P2P Networks
In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To thi...
Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersd...
SIGIR
2008
ACM
13 years 5 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
GIS
2010
ACM
13 years 3 months ago
Detecting nearly duplicated records in location datasets
The quality of a local search engine, such as Google and Bing Maps, heavily relies on its geographic datasets. Typically, these datasets are obtained from multiple sources, e.g., ...
Yu Zheng, Xixuan Fen, Xing Xie, Shuang Peng, James...
MMM
2008
Springer
13 years 11 months ago
Cross-Lingual Retrieval of Identical News Events by Near-Duplicate Video Segment Detection
Recently, for reusing large quantities of accumulated news video, technology for news topic searching and tracking has become necessary. Moreover, since we need to understand a cer...
Akira Ogawa, Tomokazu Takahashi, Ichiro Ide, Hiros...