Sciweavers

6 search results - page 1 / 2
DMKD
2004
ACM
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
KDD
2008
ACM
Febrl -: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
CIKM
2007
Springer
Parallel linkage
We study the parallelization of the (record) linkage problem – i.e., to identify matching records between two collections of records, A and B. One of main idiosyncrasies of the ...
Hung-sik Kim, Dongwon Lee
VLDB
2001
ACM
Potter's Wheel: An Interactive Data Cleaning System
Cleaning data of errors in structure and content is important for data warehousing and integration. Current solutions for data cleaning involve many iterations of data “auditing...
Vijayshankar Raman, Joseph M. Hellerstein
KDD
2003
ACM
Adaptive duplicate detection using learnable string similarity measures
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Mikhail Bilenko, Raymond J. Mooney