Making cloud intermediate data fault-tolerant

Parallel dataflow programs generate enormous amounts of short-lived distributed data that are nevertheless critical for the completion of the job and for good run-time performance. We call this class of data intermediate data. This paper is the first to treat intermediate data as a first-class citizen, specifically targeting and minimizing the effect of run-time server failures on the availability of intermediate data, and thus on performance metrics such as job completion time. We propose new design techniques for a new storage system called ISS (Intermediate Storage System), implement these techniques within Hadoop, and experimentally evaluate the resulting system. With no failures, the performance of Hadoop augmented with ISS (i.e., job completion time) is comparable to base Hadoop. Under a failure, Hadoop with ISS outperforms base Hadoop and incurs up to 18% overhead compared to no-failure base Hadoop, depending on the testbed setup.
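The core idea the abstract describes, replicating intermediate (e.g., map-output) data off the critical path so that a reducer can still fetch it after a server failure, can be sketched as follows. This is an illustrative toy model only; the class and method names (`IntermediateStore`, `write`, `fetch`) are assumptions for exposition, not the actual ISS or Hadoop APIs.

```python
# Toy sketch of asynchronous replication of intermediate data.
# Assumption: one replica per partition is enough for illustration;
# the real system's replication policy and transport are not shown.
import threading

class IntermediateStore:
    """Keeps each map output on the producing node and, in the
    background, copies it to one replica node so a consumer can
    still fetch it if the producer fails mid-job."""

    def __init__(self):
        self.nodes = {}       # node name -> {partition: data}
        self.lock = threading.Lock()
        self.pending = []     # in-flight replication threads

    def write(self, producer, replica, partition, data):
        # The local write is synchronous: the producing task is not delayed.
        with self.lock:
            self.nodes.setdefault(producer, {})[partition] = data
        # Replication happens asynchronously, off the critical path.
        t = threading.Thread(target=self._replicate,
                             args=(replica, partition, data))
        t.start()
        self.pending.append(t)

    def _replicate(self, replica, partition, data):
        with self.lock:
            self.nodes.setdefault(replica, {})[partition] = data

    def fail(self, node):
        # Simulate a run-time server failure: its local data is lost.
        with self.lock:
            self.nodes.pop(node, None)

    def fetch(self, partition):
        # A consumer tries every live node until it finds the partition.
        with self.lock:
            for files in self.nodes.values():
                if partition in files:
                    return files[partition]
        raise KeyError(partition)

store = IntermediateStore()
store.write("node-a", "node-b", "part-0", b"map output")
for t in store.pending:
    t.join()              # wait for background replication to finish
store.fail("node-a")      # the producing server crashes mid-job
print(store.fetch("part-0"))  # served by the replica: b'map output'
```

Because replication runs in a background thread, the producing task returns as soon as its local write completes, which is one plausible reason the no-failure overhead can stay low, as the abstract reports for ISS.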
Added 10 Jul 2010
Updated 10 Jul 2010
Type Conference
Year 2010
Authors Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil Gupta