ICDE
2010
IEEE

Optimizing ETL workflows for fault-tolerance

Extract-Transform-Load (ETL) processes play an important role in data warehousing. Typically, design work on ETL has focused on performance as the sole metric to make sure that the ETL process finishes within an allocated time window. However, other quality metrics are also important and need to be considered during ETL design. In this paper, we address ETL design for performance plus fault-tolerance and freshness. There are many reasons why an ETL process can fail and a good design needs to guarantee that it can be recovered within the ETL time window. How to make ETL robust to failures is not trivial. There are different strategies that can be used and they each have different costs and benefits. In addition, other metrics can affect the choice of a strategy; e.g., higher freshness reduces the time window for recovery. The design space is too large for informal, ad-hoc approaches. In this paper, we describe our QoX optimizer that considers multiple design strategies and finds an ETL ...
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 2010
Where ICDE
Authors Alkis Simitsis, Kevin Wilkinson, Umeshwar Dayal, Malú Castellanos