Decoupling storage and computation in Hadoop with SuperDataNodes

The rise of ad-hoc data-intensive computing has led to the development of data-parallel programming systems such as Map/Reduce and Hadoop, which achieve scalability by tightly coupling storage and computation. This coupling can be limiting when the ratio of computation to storage is not known in advance or changes over time. In this work, we examine decoupling storage and computation in Hadoop through SuperDataNodes, which are servers that contain an order of magnitude more disks than traditional Hadoop nodes. We found that SuperDataNodes are not only capable of supporting workloads with high storage-to-processing ratios, but in some cases can outperform traditional Hadoop deployments through better management of a large centralized pool of disks.

George Porter

Hadoop | SIGOPS 2010 | Traditional Hadoop | Traditional Hadoop Nodes |

Post Info

Added	30 Jan 2011
Updated	30 Jan 2011
Type	Journal
Year	2010
Where	SIGOPS
Authors	George Porter