Sciweavers

DEXA
2005
Springer

An Optimal Skew-insensitive Join and Multi-join Algorithm for Distributed Architectures

Abstract. The development of scalable parallel database systems requires efficient algorithms for the join operation, which is the most frequent and expensive operation in relational database systems. The join is also the operation most vulnerable to data skew and to the high cost of communication in distributed architectures. In this paper, we present a new parallel algorithm for join and multi-join operations on distributed architectures, based on an efficient semijoin computation technique. This algorithm is proved to have optimal complexity and deterministic perfect load balancing. Its tradeoff between balancing overhead and speed-up is analyzed using the BSP cost model, which predicts a negligible join product skew and a linear speed-up. This algorithm improves our fa join and sfa join algorithms by reducing their communication and synchronization costs to a minimum while offering the same load balancing properties, even for highly skewed data.
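To make the idea of a semijoin-based join concrete, the following is a minimal sketch of a generic semijoin reduction followed by a hash-partitioned join, simulated on in-memory lists. It is not the algorithm of the paper: the function name `semijoin_reduced_join`, the list-of-fragments representation, and the plain `hash(k) % p` redistribution are all assumptions made for illustration, and plain hashing by itself does not provide the skew-insensitive load balancing the abstract describes.

```python
from collections import defaultdict


def semijoin_reduced_join(r_parts, s_parts):
    """Join two relations fragmented over p simulated processors.

    r_parts and s_parts are lists with one fragment per processor;
    each fragment is a list of (join_key, payload) tuples.
    Hypothetical helper for illustration only.
    """
    p = len(r_parts)

    # Step 1: collect the distinct join keys present in each relation.
    r_keys = {k for part in r_parts for k, _ in part}
    s_keys = {k for part in s_parts for k, _ in part}

    # Step 2: semijoin filter. Only keys present in both relations can
    # contribute to the join result, so all other tuples are dropped
    # before any tuple is redistributed.
    common = r_keys & s_keys
    r_reduced = [[t for t in part if t[0] in common] for part in r_parts]
    s_reduced = [[t for t in part if t[0] in common] for part in s_parts]

    # Step 3: redistribute the surviving tuples by hashing the join key
    # (an assumption of this sketch; skewed keys would overload one bucket).
    r_dest = [[] for _ in range(p)]
    s_dest = [[] for _ in range(p)]
    for part in r_reduced:
        for k, v in part:
            r_dest[hash(k) % p].append((k, v))
    for part in s_reduced:
        for k, w in part:
            s_dest[hash(k) % p].append((k, w))

    # Step 4: each processor joins its local buckets with a hash table.
    result = []
    for i in range(p):
        index = defaultdict(list)
        for k, v in r_dest[i]:
            index[k].append(v)
        for k, w in s_dest[i]:
            for v in index.get(k, []):
                result.append((k, v, w))
    return result


if __name__ == "__main__":
    r = [[(1, "a"), (2, "b")], [(3, "c"), (1, "d")]]
    s = [[(1, "x")], [(4, "y"), (3, "z")]]
    print(sorted(semijoin_reduced_join(r, s)))
```

In this toy input, the tuples with keys 2 and 4 are filtered out in step 2 and never redistributed, which is the communication saving a semijoin reduction aims for.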
Mostafa Bamha
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where DEXA
Authors Mostafa Bamha