Sciweavers

SIGMOD
2009
ACM
14 years 5 months ago
A grammar-based entity representation framework for data cleaning
Fundamental to data cleaning is the need to account for multiple data representations. We propose a formal framework that can be used to reason about and manipulate data represent...
Arvind Arasu, Raghav Kaushik
CAISE
2007
Springer
13 years 11 months ago
Declarative XML Data Cleaning with XClean
Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a cruc...
Melanie Weis, Ioana Manolescu
PVLDB
2010
13 years 3 months ago
On-the-Fly Entity-Aware Query Processing in the Presence of Linkage
Entity linkage is central to almost every data integration and data cleaning scenario. Traditional techniques use some computed similarity among data structure to perform merges a...
Ekaterini Ioannou, Wolfgang Nejdl, Claudia Nieder...
SIGMOD
2009
ACM
14 years 5 months ago
Exploiting context analysis for combining multiple entity resolution systems
Entity Resolution (ER) is an important real world problem that has attracted significant research interest over the past few years. It deals with determining which object descript...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
ICDE
2005
IEEE
13 years 10 months ago
Robust Identification of Fuzzy Duplicates
Detecting and eliminating fuzzy duplicates is a critical data cleaning task that is required by many applications. Fuzzy duplicates are multiple seemingly distinct tuples which re...
Surajit Chaudhuri, Venkatesh Ganti, Rajeev Motwani