DMDW
2001

Improving Data Cleaning Quality Using a Data Lineage Facility

The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for some applications, existing ETL (Extraction, Transformation, Loading) and data cleaning tools for writing data cleaning programs are insufficient. One important challenge is designing a data flow graph that effectively generates clean data. A more general difficulty is the lack of explanations of cleaning results and of user interaction facilities for tuning a data cleaning program. This paper presents a solution to this problem that enables users to express user interactions declaratively and to tune data cleaning programs.
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where DMDW
Authors Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon, Cristian-Augustin Saita