Sciweavers

128 search results for "Scaling up duplicate detection in graph data"
CIKM
2008
ACM
13 years 6 months ago
Scaling up duplicate detection in graph data
Duplicate detection determines different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations t...
Melanie Herschel, Felix Naumann
KDD
2012
ACM
11 years 7 months ago
GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries
Many data are modeled as tensors, or multidimensional arrays. Examples include the predicates (subject, verb, object) in knowledge bases, hyperlinks and anchor texts in the Web g...
U. Kang, Evangelos E. Papalexakis, Abhay Harpale, ...
ICDE
2003
IEEE
14 years 6 months ago
Scaling up the ALIAS Duplicate Elimination System
Duplicate elimination is an important stage in integrating data from multiple sources. The challenges involved are finding a robust deduplication function that can identify when t...
Sunita Sarawagi, Alok Kirpal
OOPSLA
2005
ACM
13 years 10 months ago
SDD: high performance code clone detection system for large scale source code
Code clones in software increase maintenance cost and lower software quality. We have devised a new algorithm to detect duplicated parts of source code in large software. Our algo...
Seunghak Lee, Iryoung Jeong
WWW
2008
ACM
14 years 5 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu ...