Sciweavers

29 search results - page 1 / 6
CEC
2010
IEEE
13 years 2 months ago
Active Learning Genetic programming for record deduplication
The great majority of genetic programming (GP) algorithms that deal with the classification problem follow a supervised approach, i.e., they consider that all fitness cases availab...
Junio de Freitas, Gisele L. Pappa, Altigran Soares...
ICDE
2003
IEEE
14 years 5 months ago
Scaling up the ALIAS Duplicate Elimination System
Duplicate elimination is an important stage in integrating data from multiple sources. The challenges involved are finding a robust deduplication function that can identify when t...
Sunita Sarawagi, Alok Kirpal
KDD
2002
ACM
14 years 4 months ago
Interactive deduplication using active learning
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
Sunita Sarawagi, Anuradha Bhamidipaty
VLDB
2002
ACM
13 years 4 months ago
ALIAS: An Active Learning led Interactive Deduplication System
Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domain-specific operation. We present our design of ALIAS that us...
Sunita Sarawagi, Anuradha Bhamidipaty, Alok Kirpal...
IQ
2003
13 years 5 months ago
ClueMaker: A Language for Approximate Record Matching
We introduce ClueMaker, the first language designed specifically for approximate record matching. Clues written in ClueMaker predict whether two records denote the same thing based...
Martin Buechi, Andrew Borthwick, Adam Winkel, Arth...