Search Sciweavers | Sciweavers

24 search results - page 1 / 5

» Detecting nearly duplicated records in location datasets

GIS
2010
ACM

312 views, Automated Reasoning

Detecting nearly duplicated records in location datasets

15 years 8 days ago

Download research.microsoft.com

The quality of a local search engine, such as Google and Bing Maps, heavily relies on its geographic datasets. Typically, these datasets are obtained from multiple sources, e.g., ...

Yu Zheng, Xixuan Fen, Xing Xie, Shuang Peng, James...

WWW
2008
ACM

214 views, Internet Technology

16 years 2 months ago

Efficient similarity joins for near duplicate detection

Download www2008.org

With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...

Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu ...

SIGMOD
2010
ACM

269 views, Database

MapDupReducer: detecting near duplicates over massive datasets

15 years 1 months ago

Download www.cse.unsw.edu.au

Chaokun Wang, Jianmin Wang, Xuemin Lin, Wei Wang, ...

P2P
2010
IEEE

202 views, Communications

Optimizing Near Duplicate Detection for P2P Networks

15 years 4 days ago

Download www.l3s.de

In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To thi...

Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersd...

KDD
2003
ACM

214 views, Data Mining

Adaptive duplicate detection using learnable string similarity measures

16 years 2 months ago

Download www.cs.utexas.edu

The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...

Mikhail Bilenko, Raymond J. Mooney
