SIGOPS
2010

Decoupling storage and computation in Hadoop with SuperDataNodes

The rise of ad-hoc data-intensive computing has led to the development of data-parallel programming systems such as Map/Reduce and Hadoop, which achieve scalability by tightly coupling storage and computation. This coupling can be limiting when the ratio of computation to storage is not known in advance or changes over time. In this work, we examine decoupling storage and computation in Hadoop through SuperDataNodes, which are servers that contain an order of magnitude more disks than traditional Hadoop nodes. We found that SuperDataNodes are not only capable of supporting workloads with high storage-to-processing ratios, but in some cases can outperform traditional Hadoop deployments through better management of a large centralized pool of disks.
George Porter
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where SIGOPS
Authors George Porter