Practical Lineage Tracing in Data Warehouses

We consider the view data lineage problem in a warehousing environment: For a given data item in a materialized warehouse view, we want to identify the set of source data items that produced the view item. We formalize the problem, and we present a lineage tracing algorithm for relational views with aggregation. Based on our tracing algorithm, we propose a number of schemes for storing auxiliary views that enable consistent and efficient lineage tracing in a multisource data warehouse. We report on a performance study of the various schemes, identifying which schemes perform best in which settings. Based on our results, we have implemented a lineage tracing package in the WHIPS data warehousing system prototype at Stanford. With this package, users can select view tuples of interest, then efficiently "drill through" to examine the exact source tuples that produced the view tuples of interest.
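As a rough illustration of the problem statement only, the following Python sketch materializes a small aggregate view and then traces one view tuple back to the source rows that produced it. It is not the algorithm from the paper: the `SALES` table, the view definition, and the function names `build_view` and `trace_lineage` are all hypothetical, and the tracing step simply re-applies the view's selection and grouping conditions to a single in-memory source.

```python
from collections import defaultdict

# Hypothetical source table: (store, item, amount)
SALES = [
    ("s1", "pen", 3),
    ("s1", "ink", 5),
    ("s2", "pen", 4),
    ("s2", "pad", 0),
]


def build_view(sales):
    """Materialize a view equivalent to:
    SELECT store, SUM(amount) FROM sales WHERE amount > 0 GROUP BY store
    """
    totals = defaultdict(int)
    for store, _item, amount in sales:
        if amount > 0:  # selection condition of the view
            totals[store] += amount
    return sorted(totals.items())


def trace_lineage(view_tuple, sales):
    """Return the source rows that contributed to one view tuple.

    A source row contributes if it passes the view's selection
    and falls in the same group as the view tuple.
    """
    store, _total = view_tuple
    return [row for row in sales if row[0] == store and row[2] > 0]


view = build_view(SALES)
lineage = trace_lineage(("s2", 4), SALES)
```

Here the row `("s2", "pad", 0)` is excluded from the lineage of the `s2` view tuple because it fails the view's selection condition, even though it belongs to the same store.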
Yingwei Cui, Jennifer Widom
Added 31 Jul 2010
Updated 31 Jul 2010
Type Conference
Year 2000
Where ICDE
Authors Yingwei Cui, Jennifer Widom