Sciweavers

51 search results - page 1 / 11
CLOSER
2011
12 years 5 months ago
Handling Data Skew in MapReduce
Benjamin Gufler, Nikolaus Augsten, Angelika Reiser...
ICDE
2012
IEEE
11 years 8 months ago
Load Balancing in MapReduce Based on Scalable Cardinality Estimates
MapReduce has emerged as a popular tool for distributed and scalable processing of massive data sets and is increasingly being used in e-science applications. Unfortunately, the...
Benjamin Gufler, Nikolaus Augsten, Angelika Reiser...
SIGMOD
2010
ACM
13 years 11 months ago
ParaTimer: a progress indicator for MapReduce DAGs
Time-oriented progress estimation for parallel queries is a challenging problem that has received only limited attention. In this paper, we present ParaTimer, a new type of timere...
Kristi Morton, Magdalena Balazinska, Dan Grossman
CLOUDCOM
2010
Springer
13 years 2 months ago
LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud
This paper investigates the problem of Partitioning Skew in MapReduce-based systems. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...
CIKM
2011
Springer
12 years 6 months ago
Block-based load balancing for entity resolution with MapReduce
The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks. In the...
Lars Kolb, Andreas Thor, Erhard Rahm