WEBDB
2009
Springer

A Machine Learning Approach to Foreign Key Discovery

We study the problem of automatically discovering semantic associations between schema elements, namely foreign keys. This problem is important in all applications that must integrate data sets structured in tables but lacking explicit foreign key constraints. If such constraints could be recovered automatically, querying and integrating such databases would become much easier. One can find candidates for foreign key constraints in a given database instance by computing all inclusion dependencies (INDs) between attributes. However, this set usually contains many false positives due to spurious set inclusions. We present a machine learning approach to tackle this problem. We first compute all INDs of a given schema and then let a binary classification algorithm judge each one, using a small set of features that can be derived efficiently using standard SQL. We demonstrate the feasibility of this approach using cross-validation with several state-of-the-art classifi...
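The candidate-generation step mentioned in the abstract (computing unary INDs with standard SQL) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tables `customer` and `orders`, and the helper names `build_demo_db` and `unary_inds`, are hypothetical, and neither the classifier nor the paper's feature set is reproduced. The sketch only shows why INDs alone yield false positives.

```python
import sqlite3
from itertools import permutations


def build_demo_db():
    """Create a small in-memory database (hypothetical toy schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, age INTEGER)")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(1, 30), (2, 41), (3, 25)])
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 2), (2, 3), (3, 3)])
    return conn


def unary_inds(conn, columns):
    """Return (dependent, referenced) column pairs such that every non-NULL
    value of the dependent column also occurs in the referenced column."""
    found = []
    for (dep_t, dep_c), (ref_t, ref_c) in permutations(columns, 2):
        # Count dependent values that have no match in the referenced column.
        query = (
            f'SELECT COUNT(*) FROM "{dep_t}" AS d '
            f'WHERE d."{dep_c}" IS NOT NULL AND NOT EXISTS '
            f'(SELECT 1 FROM "{ref_t}" AS r WHERE r."{ref_c}" = d."{dep_c}")'
        )
        (violations,) = conn.execute(query).fetchone()
        if violations == 0:
            found.append(((dep_t, dep_c), (ref_t, ref_c)))
    return found


if __name__ == "__main__":
    conn = build_demo_db()
    cols = [("customer", "id"), ("orders", "id"), ("orders", "customer_id")]
    for dep, ref in unary_inds(conn, cols):
        print(f"{dep[0]}.{dep[1]} is included in {ref[0]}.{ref[1]}")
```

On this toy data, four INDs hold, but only `orders.customer_id` ⊆ `customer.id` corresponds to an intended foreign key. The other three arise only because small integer ID ranges overlap, which is the kind of spurious inclusion a classifier would have to filter out.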
Added 25 May 2010
Updated 25 May 2010
Type Conference
Year 2009
Where WEBDB
Authors Alexandra Rostin, Oliver Albrecht, Jana Bauckmann, Felix Naumann, Ulf Leser