Ibis: A Provenance Manager for Multi-Layer Systems

End-to-end data processing environments are often composed of several independently developed (sub-)systems, e.g. for engineering, organizational or historical reasons. Unfortunately, this situation harms usability. For one thing, systems created independently tend to have disparate capabilities in terms of what metadata is retained and how it can be queried. If something goes wrong, it can be very difficult to trace execution histories across the various sub-systems. One solution is to ship each sub-system’s metadata to a central metadata manager that integrates it and offers a powerful and uniform query interface. This paper describes a metadata manager we are building, called Ibis. Perhaps the greatest challenge in this context is dealing with data provenance queries in the presence of mixed granularities of metadata (e.g. rows vs. column groups vs. tables; MapReduce job slices vs. relational operators) supplied by different sub-systems. The central contribution of our work is ...
Christopher Olston, Anish Das Sarma
Added 25 Aug 2011
Updated 25 Aug 2011
Type Journal
Year 2011
Where CIDR