Large-Scale Collective Entity Matching

There have been several recent advances in the machine learning community on the Entity Matching (EM) problem. However, their lack of scalability has prevented them from being applied in practical settings to large real-life datasets. To this end, we propose a principled framework to scale any generic EM algorithm. Our technique consists of running multiple instances of the EM algorithm on small neighborhoods of the data and passing messages across neighborhoods to construct a global solution. We prove formal properties of our framework and experimentally demonstrate the effectiveness of our approach in scaling EM algorithms.
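The neighborhood-plus-messages idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or code: the records, the `toy_matcher` rule, the neighborhood layout, and the function names below are all hypothetical, chosen only to show how matches found in one neighborhood might be passed as evidence to another until nothing new is found.

```python
from itertools import combinations

# Hypothetical toy records: id -> (name, coauthor id). Not data from the paper.
RECORDS = {
    "a1": ("J. Smith", "p1"),
    "a2": ("John Smith", "p2"),
    "p1": ("Mary Jones", "a1"),
    "p2": ("Mary Jones", "a2"),
}

def pair(x, y):
    """Unordered pair of record ids."""
    return frozenset((x, y))

def compatible(n1, n2):
    """Same surname and same first initial, e.g. 'J. Smith' vs 'John Smith'."""
    first1, last1 = n1.split()[0], n1.split()[-1]
    first2, last2 = n2.split()[0], n2.split()[-1]
    return last1 == last2 and first1[0] == first2[0]

def toy_matcher(ids, records, evidence):
    """Assumed stand-in for a generic EM algorithm run on one neighborhood.

    Matches two records if their names are identical, or if their names are
    compatible and their coauthors are already known to match (evidence).
    """
    found = set()
    for x, y in combinations(sorted(ids), 2):
        (name_x, co_x), (name_y, co_y) = records[x], records[y]
        if name_x == name_y:
            found.add(pair(x, y))
        elif compatible(name_x, name_y) and pair(co_x, co_y) in evidence:
            found.add(pair(x, y))
    return found

def run_with_messages(neighborhoods, records, matcher):
    """Run the matcher per neighborhood, sharing matches until a fixed point."""
    matches = set()
    changed = True
    while changed:
        changed = False
        for ids in neighborhoods:
            # Matches found so far act as the "messages" seen by this run.
            new = matcher(ids, records, matches) - matches
            if new:
                matches |= new
                changed = True
    return matches

NEIGHBORHOODS = [{"a1", "a2"}, {"p1", "p2"}]
result = run_with_messages(NEIGHBORHOODS, RECORDS, toy_matcher)
```

In this toy setup the first neighborhood cannot decide on its own whether "J. Smith" and "John Smith" are the same entity; it can only do so after the second neighborhood reports that the two "Mary Jones" records match. The loop terminates because the set of matches only grows and the number of possible pairs is finite.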
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2011
Where CoRR
Authors Vibhor Rastogi, Nilesh N. Dalvi, Minos N. Garofalakis