Experimental evidence on partitioning in parallel data warehouses

Parallelism can yield major performance improvements in large data warehouses (DW) that face performance and scalability challenges. A simple, low-cost shared-nothing architecture with horizontally fully partitioned facts can significantly speed up data warehouse response time. However, the extra overheads of processing large replicated relations and of repartitioning data between nodes can significantly degrade speedup for many query patterns unless special care is taken during placement to minimize them. In this paper we demonstrate these problems experimentally using the TPC-H performance evaluation benchmark and identify simple modifications that minimize such overheads. We experimentally analyze a simple, easy-to-apply partitioning and placement decision that achieves good performance improvements.
Categories and Subject Descriptors H.2.4 [Systems]: Parallel and Distributed Databases - retrieval mo...
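The architecture the abstract describes can be illustrated with a toy, in-memory sketch. It is not the paper's implementation: the node count, the row layout, and every function name below are hypothetical, chosen only to show fact rows hash-partitioned across nodes, a small dimension replicated on each node so the join stays local, and partial aggregates merged at the end.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size, not taken from the paper


def partition_facts(facts, key, num_nodes=NUM_NODES):
    """Horizontally partition fact rows across nodes by one integer key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in facts:
        nodes[row[key] % num_nodes].append(row)
    return nodes


def local_aggregate(fact_slice, dimension):
    """Join one node's fact slice with its local copy of the dimension
    and sum the amounts per category. No data leaves the node."""
    totals = defaultdict(float)
    for row in fact_slice:
        category = dimension[row["part_id"]]["category"]
        totals[category] += row["amount"]
    return totals


def merge(partials):
    """Combine the per-node partial sums into the final result."""
    totals = defaultdict(float)
    for partial in partials:
        for category, value in partial.items():
            totals[category] += value
    return dict(totals)


def parallel_sum_by_category(facts, dimension, num_nodes=NUM_NODES):
    """Simulate the query: partition, aggregate locally, then merge."""
    slices = partition_facts(facts, "order_id", num_nodes)
    return merge(local_aggregate(s, dimension) for s in slices)
```

In this sketch the dimension is small, so copying it to every node is cheap. The overheads the abstract refers to arise when a replicated relation is large, or when a join key differs from the partitioning key and rows must be moved between nodes before the local step can run.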
Pedro Furtado
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Authors Pedro Furtado