Manimal: Relational Optimization for Data-Intensive Programs

The MapReduce distributed programming framework is very popular, but it currently lacks the optimization techniques that have been standard in relational database systems for many years. This paper proposes Manimal, which uses static code analysis to detect MapReduce program semantics and thereby enable wholly automatic optimization of MapReduce programs. For example, a programmer’s map function that emits data only when an if statement holds true is essentially encoding a selection condition; code analysis can detect and characterize such conditions. If Manimal has an appropriate index available, it can then alter MapReduce execution to use it. Manimal can address many different optimization opportunities, including projections, structure-aware data compression, and others. However, this paper illustrates the system by focusing on one: efficient selection. We give a static analysis algorithm that can detect selections in user programs, and cover how Manimal can employ a B+Tree ...
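The idea of a map function that encodes a selection can be illustrated with a minimal sketch. This is not Manimal's algorithm, which the abstract does not spell out; it is a toy detector built on Python's standard ast module (ast.unparse needs Python 3.9 or later). The names map_fn, emit, find_selection, and the record fields "rank" and "url" are all hypothetical.

```python
import ast
import textwrap

# Hypothetical user map function, kept as source text so it can be analyzed.
# It emits a pair only when the guard holds, so the guard acts as a selection.
MAP_SOURCE = textwrap.dedent('''
    def map_fn(key, record, emit):
        if record["rank"] > 10:
            emit(record["url"], 1)
''')


def find_selection(source, emit_name="emit"):
    """Return the guard condition as text, or None if no simple selection is found.

    Deliberately conservative (an assumption of this sketch): the function body
    must be exactly one `if` without an `else`, and the `if` body must contain
    a call to `emit_name`.
    """
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        return None
    if len(func.body) != 1 or not isinstance(func.body[0], ast.If):
        return None
    guard = func.body[0]
    if guard.orelse:
        return None
    # Look for an emit call anywhere inside the guarded statements.
    for stmt in guard.body:
        for node in ast.walk(stmt):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == emit_name):
                return ast.unparse(guard.test)
    return None


if __name__ == "__main__":
    # Prints the detected selection predicate as source text.
    print(find_selection(MAP_SOURCE))
```

A predicate recovered this way is the kind of information an optimizer could match against an available index on the tested field; how Manimal actually does so is described in the paper itself.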
Michael J. Cafarella, Christopher Ré
Added 11 Jul 2010
Updated 11 Jul 2010
Type Conference
Year 2010