MapReduce has become prevalent for running data-parallel applications. By hiding non-functional concerns such as parallelism, fault tolerance, and load balancing from programmers,...
Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, ...
The MapReduce programming model, introduced by Google, has become popular over the past few years as a mechanism for processing large amounts of data, using shared-nothing parall...
Sriram Krishnan, Chaitanya K. Baru, Christopher J....
Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...
This paper describes the results of a performance evaluation of two kinds of MapReduce applications running on FutureGrid: a data-intensive application and a computation-intensive...
In an attempt to increase the performance/cost ratio, large compute clusters are becoming heterogeneous at multiple levels: from asymmetric processors to different system archi...