Automatic Data Fusion with HumMer

Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and it has missing values and conflicting values. The Humboldt Merger (HumMer) is a tool that allows ad-hoc, declarative fusion of such data using a simple extension to SQL. Guided by a query against multiple tables, HumMer proceeds in three fully automated steps: First, instance-based schema matching bridges schematic heterogeneity of the tables by aligning corresponding attributes. Next, duplicate detection techniques find multiple representations of identical real-world objects. Finally, data fusion and conflict resolution merges duplicates into a single, consistent, and clean representation.

1 Fusing Heterogeneous, Duplicate, and Conflicting Data

The task of fusing data involves the solution of many different problems, each one in itself formidable: Apart from the technical challenges of accessing remote...
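The three steps named in the abstract (schema matching, duplicate detection, fusion) can be illustrated with a small Python sketch. This is not HumMer's implementation or its SQL extension: the toy tables, the function names (`match_schema`, `find_duplicates`, `fuse`), the value-overlap matcher, the `difflib` similarity with a 0.85 threshold, and the "first non-missing value" conflict rule are all assumptions chosen only to make the pipeline concrete.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Two toy tables with different attribute names (hypothetical data).
table_a = [
    {"name": "John Smith", "city": "Berlin", "age": 34},
    {"name": "Mary Jones", "city": "Hamburg", "age": None},
]
table_b = [
    {"person": "Jon Smith", "town": "Berlin", "years": 34},
    {"person": "Mary Jones", "town": "Hamburg", "years": 41},
    {"person": "Paul Adams", "town": "Munich", "years": 52},
]

def match_schema(left, right):
    """Step 1: map each right attribute to the left attribute sharing the most values."""
    mapping = {}
    for r_attr in right[0]:
        r_vals = {str(row[r_attr]) for row in right if row[r_attr] is not None}

        def overlap(l_attr):
            l_vals = {str(row[l_attr]) for row in left if row[l_attr] is not None}
            return len(l_vals & r_vals)

        mapping[r_attr] = max(left[0], key=overlap)
    return mapping

def similarity(a, b, attrs):
    """Average string similarity over attributes present in both rows."""
    scores = [
        SequenceMatcher(None, str(a[k]), str(b[k])).ratio()
        for k in attrs
        if a[k] is not None and b[k] is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0

def find_duplicates(rows, attrs, threshold=0.85):
    """Step 2: group rows whose pairwise similarity reaches the threshold."""
    cluster = list(range(len(rows)))  # one cluster id per row
    for i, j in combinations(range(len(rows)), 2):
        if similarity(rows[i], rows[j], attrs) >= threshold:
            old, new = cluster[j], cluster[i]
            cluster = [new if c == old else c for c in cluster]
    groups = {}
    for idx, c in enumerate(cluster):
        groups.setdefault(c, []).append(rows[idx])
    return list(groups.values())

def fuse(group):
    """Step 3: merge one duplicate group, keeping the first non-missing value per attribute."""
    fused = {}
    for attr in group[0]:
        values = [row[attr] for row in group if row[attr] is not None]
        fused[attr] = values[0] if values else None
    return fused

mapping = match_schema(table_a, table_b)
aligned_b = [{mapping[k]: v for k, v in row.items()} for row in table_b]
rows = table_a + aligned_b
fused = [fuse(g) for g in find_duplicates(rows, attrs=list(table_a[0]))]
for record in fused:
    print(record)
```

In this sketch the missing `age` for Mary Jones is filled from the second table, and the two spellings of John Smith collapse into one record. A real system would need a more robust matcher and a richer set of conflict-resolution rules than "first non-missing value".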

Alexander Bilke, Jens Bleiholder, Christoph Böhm, Karsten Draba, Felix Naumann, Melanie Weis

Database | Identical Real-world Objects | Resolution Merges Duplicates | Schematic Heterogeneity | VLDB 2005 |

» Data and Decision Level Fusion of Temporal Information for Automatic Target Recognition

» Sensor and Classifier Fusion for Outdoor Obstacle Detection: an Application of Data Fusion ...

» Time-Efficient Creation of an Accurate Sentence Fusion Corpus

» Exploiting multimodal data fusion in robust speech recognition

» KnoFusius: a New Knowledge Fusion System for Interpretation of Gene Expression Data

» Automatic Extraction of Coarse-Grained Data-Flow Threads from Imperative Programs

» Information Fusion for Entity Matching in Unstructured Data

» Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis

Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	VLDB
Authors	Alexander Bilke, Jens Bleiholder, Christoph Böhm, Karsten Draba, Felix Naumann, Melanie Weis
