Sciweavers

CLOUDCOM
2010
Springer

LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud

This paper investigates the problem of partitioning skew in MapReduce-based systems. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that partitioning skew causes a large amount of data transfer during the shuffle phase and leads to significant unfairness in the reduce input across data nodes. As a result, applications suffer performance degradation from the long data transfer during the shuffle phase and from computation skew, particularly in the reduce phase. We develop a novel algorithm named LEEN for locality-aware and fairness-aware key partitioning in MapReduce. LEEN embraces an asynchronous map and reduce scheme. All buffered intermediate keys are partitioned according to their frequencies and the fairness of the expected data distribution after the shuffle phase. We have integrated LEEN into Hadoop-0.18.0. Our experiments demonstrate that LEEN can efficiently achieve higher locality and reduce the amount of shuffl...
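The two quantities the abstract refers to, the volume of shuffled data and the fairness of the reduce input, can be illustrated with a small sketch. This is not the LEEN algorithm, whose details are not given in this listing. It is a toy greedy heuristic under stated assumptions: the per-node frequency of every intermediate key is known, each node runs one reducer, and all names (`key_counts`, `greedy_assign`, and so on) are hypothetical.

```python
def reduce_input_sizes(key_counts, assign, num_nodes):
    """Total records each node receives as reduce input.

    key_counts maps a key to its record count on each node,
    assign maps a key to the node that will reduce it.
    """
    sizes = [0] * num_nodes
    for key, counts in key_counts.items():
        sizes[assign[key]] += sum(counts)
    return sizes


def shuffled_records(key_counts, assign):
    """Records that must cross the network: every record of a key
    that is not already stored on the key's reduce node."""
    return sum(sum(counts) - counts[assign[key]]
               for key, counts in key_counts.items())


def greedy_assign(key_counts, num_nodes):
    """Toy heuristic (an assumption, not the paper's method):
    place the heaviest keys first, each on the node that keeps the
    load lowest, breaking ties in favour of the node that already
    holds most of that key's records."""
    loads = [0] * num_nodes
    assign = {}
    # Heaviest keys first; key name as a deterministic tie-breaker.
    for key in sorted(key_counts, key=lambda k: (-sum(key_counts[k]), k)):
        counts = key_counts[key]
        total = sum(counts)
        node = min(range(num_nodes),
                   key=lambda n: (loads[n] + total, -counts[n]))
        assign[key] = node
        loads[node] += total
    return assign


# Hypothetical frequencies of four keys across three nodes.
key_counts = {
    "a": [9, 1, 0],
    "b": [0, 8, 2],
    "c": [1, 1, 8],
    "d": [3, 0, 0],
}

skewed = {"a": 1, "b": 1, "c": 1, "d": 1}   # everything on node 1
balanced = greedy_assign(key_counts, 3)

print(reduce_input_sizes(key_counts, skewed, 3), shuffled_records(key_counts, skewed))
print(reduce_input_sizes(key_counts, balanced, 3), shuffled_records(key_counts, balanced))
```

In this toy data set the skewed assignment sends all 33 records to one node and moves 23 of them over the network, while the greedy assignment yields reduce inputs of 13, 10, and 10 records and moves only 5.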
Added 21 Mar 2011
Updated 21 Mar 2011
Type Journal
Year 2010
Where CLOUDCOM
Authors Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng He, Li Qi