Sciweavers

CLOUDI
2013

Cache conscious star-join in MapReduce environments

With the popularity of big data and cloud computing, data warehouse systems based on the data-parallel framework MapReduce are widely used. Column store is the default data placement in these systems. Star join has traditionally been a core operation in the data warehouse. However, little related work has studied star join in column-store and MapReduce environments. This paper proposes two new cache-conscious algorithms for MapReduce environments: Multi-Fragment-Replication Join (MFRJ) and MapReduce-Invisible Join (MRIJ). Both algorithms avoid fact table data movement and are cache conscious in each MapReduce node. In addition, in MFRJ the fact table is partitioned into several column groups for cache optimization: one group contains all of the foreign key columns, and each measure column forms its own group. In MRIJ, each column is processed separately, one at a time, which gives higher cache utilization and avoids the frequent cache misses caused by switching from one column to another. MRIJ is composed of several map operations on dimensio...
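The column-at-a-time idea described in the abstract can be illustrated with a minimal single-process sketch. This is not the paper's MFRJ or MRIJ implementation and involves no MapReduce; it only shows, under assumed toy data, how each foreign key column can be scanned on its own against a small set of qualifying dimension keys before a measure column is read. All table names, column names, and helper functions here are hypothetical.

```python
# Hypothetical toy fact table, stored column by column (column-store layout).
fact = {
    "date_key":    [1, 2, 1, 3, 2],
    "product_key": [10, 11, 10, 12, 11],
    "revenue":     [100, 200, 150, 50, 300],
}

# Hypothetical dimension tables: primary key -> one attribute.
dim_date = {1: 2012, 2: 2013, 3: 2013}
dim_product = {10: "book", 11: "toy", 12: "book"}


def qualifying_keys(dimension, predicate):
    """Return the set of dimension keys whose attribute passes the predicate."""
    return {key for key, value in dimension.items() if predicate(value)}


def probe_column(fk_column, keys):
    """Scan ONE foreign key column and mark the rows whose key qualifies."""
    return [fk in keys for fk in fk_column]


def star_join_sum(fact, filters, measure):
    """Process foreign key columns one at a time, then scan the measure column."""
    selected = [True] * len(fact[measure])
    for fk_name, keys in filters.items():
        hits = probe_column(fact[fk_name], keys)  # touches a single column
        selected = [a and b for a, b in zip(selected, hits)]
    # Only now is the measure column read, and only for its own scan.
    return sum(v for v, keep in zip(fact[measure], selected) if keep)


filters = {
    "date_key": qualifying_keys(dim_date, lambda year: year == 2013),
    "product_key": qualifying_keys(dim_product, lambda cat: cat == "toy"),
}
total = star_join_sum(fact, filters, "revenue")
print(total)
```

Each pass in this sketch reads one contiguous column, which is the access pattern the abstract credits with better cache utilization; how the paper distributes these passes across map operations is not shown here.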
Guoliang Zhou, Yongli Zhu, Guilan Wang
Added 27 Apr 2014
Updated 27 Apr 2014
Type Journal
Year 2013
Where CLOUDI
Authors Guoliang Zhou, Yongli Zhu, Guilan Wang