Sciweavers

TKDE
2011

Efficient Techniques for Online Record Linkage

The need to consolidate the information contained in heterogeneous data sources has been widely documented in recent years. In order to accomplish this goal, an organization must resolve several types of heterogeneity problems, especially the entity heterogeneity problem that arises when the same real-world entity type is represented using different identifiers in different data sources. Statistical record linkage techniques could be used for resolving this problem. However, the use of such techniques for online record linkage could pose a tremendous communication bottleneck in a distributed environment (where entity heterogeneity problems are often encountered). In order to resolve this issue, we develop a matching tree, similar to a decision tree, and use it to propose techniques that reduce the communication overhead significantly, while providing matching decisions that are guaranteed to be the same as those obtained using the conventional linkage technique. These techniques hav...
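
The idea of cutting communication without changing the matching decision can be illustrated with a small sketch. This is not the paper's matching tree, whose construction the truncated abstract does not give. It is a simpler early-stopping rule, assuming a linkage decision made by summing per-attribute agreement and disagreement weights and comparing the total to a threshold. All names here (`full_decision`, `sequential_decision`, `weights`, `order`, `fetch`) and all numeric values are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical per-attribute weights: (weight if values agree, weight if they differ).
Weights = Dict[str, Tuple[float, float]]


def full_decision(local: Dict[str, str], remote: Dict[str, str],
                  weights: Weights, threshold: float) -> bool:
    """Conventional rule: compare every attribute, then threshold the total."""
    total = sum(
        agree if local[a] == remote[a] else disagree
        for a, (agree, disagree) in weights.items()
    )
    return total >= threshold


def sequential_decision(local: Dict[str, str], fetch: Callable[[str], str],
                        weights: Weights, order: Sequence[str],
                        threshold: float) -> Tuple[bool, int]:
    """Fetch remote attributes one at a time and stop once the outcome is fixed.

    Returns (is_match, number_of_remote_attributes_fetched).
    """
    score = 0.0
    # Largest and smallest totals the not-yet-fetched attributes could still add.
    best_rest = sum(max(weights[a]) for a in order)
    worst_rest = sum(min(weights[a]) for a in order)
    fetched = 0
    for attr in order:
        if score + worst_rest >= threshold:
            return True, fetched    # a match even if everything else disagrees
        if score + best_rest < threshold:
            return False, fetched   # a non-match even if everything else agrees
        agree, disagree = weights[attr]
        best_rest -= max(agree, disagree)
        worst_rest -= min(agree, disagree)
        score += agree if local[attr] == fetch(attr) else disagree
        fetched += 1
    return score >= threshold, fetched


# Illustrative data only; `remote.__getitem__` stands in for a network call.
weights = {"ssn": (6.0, -5.0), "name": (3.0, -2.0), "zip": (1.0, -1.0)}
order = ["ssn", "name", "zip"]
local = {"ssn": "123", "name": "Ann Lee", "zip": "75080"}
remote = {"ssn": "123", "name": "Ann Lee", "zip": "75081"}

is_match, fetched = sequential_decision(
    local, remote.__getitem__, weights, order, threshold=2.0
)
print(is_match, fetched)
```

With these made-up weights, agreement on `ssn` alone already settles the decision, so only one of the three remote attributes is transferred, and the result equals what `full_decision` returns on the complete record.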
Debabrata Dey, Vijay S. Mookerjee, Dengpan Liu
Added 15 May 2011
Updated 15 May 2011
Type Journal
Year 2011
Where TKDE
Authors Debabrata Dey, Vijay S. Mookerjee, Dengpan Liu