Distributed data-parallel computing using a high-level programming language

9 years 9 months ago
Distributed data-parallel computing using a high-level programming language
The Dryad and DryadLINQ systems offer a new programming model for large scale data-parallel computing. They generalize previous execution environments such as SQL and MapReduce in three ways: by providing a general-purpose distributed execution engine for data-parallel applications; by adopting an expressive data model of strongly typed .NET objects; and by supporting general-purpose imperative and declarative operations on datasets within a traditional high-level programming language. A DryadLINQ program is a sequential program composed of LINQ expressions performing arbitrary side-effect-free operations on datasets, and can be written and debugged using standard .NET development tools. The DryadLINQ system automatically and transparently translates the data-parallel portions of the program into a distributed execution plan which is passed to the Dryad execution platform. Dryad, which has been in continuous operation for several years on production clusters made up of thousands of co...
Michael Isard, Yuan Yu
Added 05 Dec 2009
Updated 05 Dec 2009
Type Conference
Year 2009
Authors Michael Isard, Yuan Yu
Comments (0)