Fundamental to data cleaning is the need to account for multiple data representations. We propose a formal framework that can be used to reason about and manipulate data represent...
Data cleaning is the process of correcting anomalies in a data source, that may for instance be due to typographical errors, or duplicate representations of an entity. It is a cruc...
Entity linkage is central to almost every data integration and data cleaning scenario. Traditional techniques use some computed similarity among data structure to perform merges a...
Ekaterini Ioannou, Wolfgang Nejdl, Claudia Nieder&...
Entity Resolution (ER) is an important real world problem that has attracted significant research interest over the past few years. It deals with determining which object descript...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
Detecting and eliminating fuzzy duplicates is a critical data cleaning task that is required by many applications. Fuzzy duplicates are multiple seemingly distinct tuples which re...