GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications

To be competitive, Enterprises are collecting and analyzing increasingly large amounts of data in order to derive business insights. However, there are at least two challenges in meeting this increasing demand. First, the growth in the amount of data far outpaces the growth in the computation power of a uniprocessor. The growing gap between the supply of and demand for computation power forces Enterprises to parallelize their application code. Unfortunately, parallel programming is both time-consuming and error-prone. Second, the emerging Cloud Computing paradigm imposes constraints on the underlying infrastructure, which forces Enterprises to rethink their application architecture. We propose the GridBatch system, which aims at solving large-scale data-intensive batch problems under the Cloud infrastructure constraints. GridBatch is a programming model and associated library that hides the complexity of parallel programming, yet gives users complete control over how data are partitioned and how ...
Huan Liu, Dan Orban
Added 29 May 2010
Updated 29 May 2010
Type Conference
Year 2008
Authors Huan Liu, Dan Orban