Sciweavers

DMDW
2001
128 views, Management
13 years 5 months ago
Improving Data Cleaning Quality Using a Data Lineage Facility
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehou...
Helena Galhardas, Daniela Florescu, Dennis Shasha,...
KDD
2000
ACM
145 views, Data Mining
13 years 8 months ago
IntelliClean: a knowledge-based intelligent data cleaner
Existing data cleaning methods work on the basis of computing the degree of similarity between nearby records in a sorted database. High recall is achieved by accepting records wi...
Mong-Li Lee, Tok Wang Ling, Wai Lup Low
EDBTW
2006
Springer
13 years 8 months ago
Preference-Driven Querying of Inconsistent Relational Databases
One of the goals of cleaning an inconsistent database is to remove conflicts between tuples. Typically, the user specifies how the conflicts should be resolved. Sometimes this spec...
Slawomir Staworko, Jan Chomicki, Jerzy Marcinkowsk...
VLDB
2001
ACM
108 views, Database
13 years 9 months ago
Potter's Wheel: An Interactive Data Cleaning System
Cleaning data of errors in structure and content is important for data warehousing and integration. Current solutions for data cleaning involve many iterations of data “auditing...
Vijayshankar Raman, Joseph M. Hellerstein
ICEIS
2003
IEEE
13 years 9 months ago
Data Position and Profiling in Domain-Independent Warehouse Cleaning
A major problem that arises from integrating different databases is the existence of duplicates. Data cleaning is the process for identifying two or more records within the datab...
Christie I. Ezeife, Ajumobi Udechukwu
VLDB
2004
ACM
120 views, Database
13 years 9 months ago
Merging the Results of Approximate Match Operations
Data Cleaning is an important process that has been at the center of research interest in recent years. An important end goal of effective data cleaning is to identify the relatio...
Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Sr...
DEXA
2004
Springer
147 views, Database
13 years 10 months ago
A Flexible Fuzzy Expert System for Fuzzy Duplicate Elimination in Data Cleaning
Data cleaning deals with the detection and removal of errors and inconsistencies in data gathered from distributed sources. This process is essential for drawing correct conclusio...
Hamid Haidarian Shahri, Ahmad Abdollahzadeh Barfor...
IQIS
2005
ACM
13 years 10 months ago
Exploiting relationships for object consolidation
Researchers in the data mining area frequently have to spend a significant portion of their time on preprocessing the data in order to apply their algorithms to real-world datasets...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
DIS
2007
Springer
13 years 10 months ago
Machine Learning in Ecosystem Informatics
Ecosystem Informatics brings together mathematical and computational tools to address scientific and policy challenges in the ecosystem sciences. These challenges include novel s...
Thomas G. Dietterich
CAISE
2007
Springer
13 years 10 months ago
Declarative XML Data Cleaning with XClean
Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a cruc...
Melanie Weis, Ioana Manolescu