PUSH: A Dataflow Shell

The deluge of huge datasets such as those provided by sensor networks, online transactions, and the web provides exciting opportunities for data analysis. The scale of the data makes it impossible to process in a reasonable amount of time on isolated machines. This has led to dataflow systems emerging as the standard tool for solving research problems using these vast datasets. In typical dataflow systems, runtimes like Dryad [3] and Streamline [1] define graphs of processes, with the edges of the graphs representing pipes and the vertices representing computation. Within these runtimes, a new class of languages such as Sawzall [6] can be used by researchers to solve "pleasantly parallel" problems (problems where the individual elements of a dataset are considered independent of one another) more quickly without worrying about explicit concurrency. These languages provide automated control flow (typically matched to the architecture of the underlying runtim...
Noah Evans, Eric Van Hensbergen
Added 08 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2010
Where Eurosys