Sciweavers

78
Voted
CIKM
2006
Springer

Describing differences between databases

15 years 16 days ago
Describing differences between databases
We study the novel problem of efficiently computing the update distance for a pair of relational databases. In analogy to the edit distance of strings, we define the update distance of two databases as the minimal number of set-oriented insert, delete and modification operations necessary to transform one database into the other. We show how this distance can be computed by traversing a search space of database instances connected by update operations. This insight leads to a family of algorithms that compute the update distance or approximations of it. In our experiments we observed that a simple heuristic performs surprisingly well in most considered cases. Our motivation for studying distance measures for databases stems from the field of scientific databases. There, replicas of a single database are often maintained at different sites, which typically leads to (accidental or planned) divergence of their content. To re-create a consistent view, these differences must be resolved. S...
Heiko Müller, Johann Christoph Freytag, Ulf L
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CIKM
Authors Heiko Müller, Johann Christoph Freytag, Ulf Leser
Comments (0)