Provenance for nested subqueries

Data provenance is essential in applications such as scientific computing, curated databases, and data warehouses. Several systems have been developed that provide provenance functionality for the relational data model. These systems support only a small subset of SQL, a severe limitation in practice, since most of the application domains that benefit from provenance information use complex queries. Such queries typically involve nested subqueries, aggregation, and/or user-defined functions. Without support for these constructs, a provenance management system is of limited use. In this paper we address this limitation by exploring the problem of provenance derivation when complex queries are involved. More precisely, we demonstrate that the widely used definition of Why-provenance fails in the presence of nested subqueries, and show how the definition can be modified to produce meaningful results for nested subqueries. We further present query rewrite rules to transform an SQL que...
Boris Glavic, Gustavo Alonso
Added 24 Jul 2010
Updated 24 Jul 2010
Type Conference
Year 2009
Where EDBT