Sciweavers

DEXA
2009
Springer

A Versatile Record Linkage Method by Term Matching Model Using CRF

We solve the problem of record linkage between databases where record fields are mixed and permuted in different ways. Our method uses a conditional random fields (CRF) model to find matching terms in record pairs and then uses those matching terms in the duplicate detection process. Although records with permuted fields may have partly reordered terms, our method can still exploit the local order of terms to find matching terms. We carried out experiments on several well-known data sets in record linkage research, and our method showed its advantages on most of them. We also ran experiments on a synthetic data set in which records combine fields in random order, and verified that the method could handle even this data set.
Quang Minh Vu, Atsuhiro Takasu, Jun Adachi
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where DEXA
Authors Quang Minh Vu, Atsuhiro Takasu, Jun Adachi