An optimization framework for map-reduce queries

We present an effective optimization framework for general SQL-like map-reduce queries, which is based on a novel query algebra and uses a small number of higher-order physical operators that are directly implementable on existing map-reduce systems, such as Hadoop. Although our framework is applicable to any SQL-like map-reduce query language, we focus on a powerful query language, called MRQL. Current map-reduce query languages, such as HiveQL and PigLatin, enable users to plug custom map-reduce scripts into queries for those jobs that cannot be declaratively coded in the query language, which may result in suboptimal, error-prone, and hard-to-maintain code. In contrast to these languages, MRQL is expressive enough to capture most of these computations in declarative form and at the same time is amenable to optimization. We describe an optimization framework that maps the algebraic forms derived from the MRQL queries to efficient workflows of map-reduce operations that consist of...
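As a rough illustration of what a "higher-order" map-reduce operator could look like, the following minimal sketch defines a generic operator parameterized by a map function and a reduce function, then uses it to evaluate a SQL-like group-by aggregation. The name map_reduce, its signature, and the sample data are assumptions made for this example only; they are not taken from MRQL, from the paper's query algebra, or from the Hadoop API, and the sketch runs in memory rather than on a cluster.

```python
from collections import defaultdict

def map_reduce(map_fn, reduce_fn, records):
    """Hypothetical in-memory operator, parameterized by two functions.

    map_fn(record)         -> iterable of (key, value) pairs
    reduce_fn(key, values) -> iterable of output records
    """
    # Map phase followed by a simulated shuffle: group values by key.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: apply reduce_fn once per distinct key.
    output = []
    for key, values in groups.items():
        output.extend(reduce_fn(key, values))
    return output

# Roughly: SELECT dept, SUM(salary) FROM employees GROUP BY dept
# (illustrative data, not from the paper)
employees = [("sales", 10), ("eng", 20), ("sales", 5)]
totals = map_reduce(
    lambda e: [(e[0], e[1])],                        # emit (dept, salary)
    lambda dept, salaries: [(dept, sum(salaries))],  # aggregate per dept
    employees,
)
```

Because the operator takes its behavior as function arguments, a declarative query can be translated into a call like the one above instead of a hand-written script, which is the general idea behind compiling SQL-like queries to map-reduce jobs.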
Leonidas Fegaras, Chengkai Li, Upa Gupta
Added 29 Sep 2012
Updated 29 Sep 2012
Type Journal
Year 2012
Where EDBT