Sciweavers

IPAW
2010

StarFlow: A Script-Centric Data Analysis Environment

We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.
Key words: automatic parallelization, automatic updating, computational workflows, control flow, data-flow, data analysis, dependency tracking, provenance, Python, workflow abstraction
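To make the idea of propagating changes through a dependency network concrete, here is a minimal Python sketch. It is not StarFlow's API or algorithm: the `DOWNSTREAM` table, the file and step names, and the function `stale_after_change` are all hypothetical, and the links are written by hand rather than extracted by static or dynamic analysis. It only shows how, once such links exist, a breadth-first walk can list everything that may need to be re-run after one item changes.

```python
from collections import deque

# Hypothetical, hand-written dependency links (an assumption for illustration).
# Each key is an upstream item; each value lists the items that consume it directly.
DOWNSTREAM = {
    "raw.csv": ["clean"],
    "clean": ["clean.csv"],
    "clean.csv": ["fit", "plot"],
    "fit": ["model.pkl"],
    "plot": ["figure.png"],
}


def stale_after_change(changed, downstream=DOWNSTREAM):
    """Return every item reachable from `changed`, in breadth-first order.

    The changed item itself is not included in the result.
    """
    visited = {changed}
    stale = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in visited:
                visited.add(child)
                stale.append(child)
                queue.append(child)
    return stale


if __name__ == "__main__":
    # List the steps and files that may be out of date after editing clean.csv.
    for item in stale_after_change("clean.csv"):
        print(item)
```

In this sketch, items on separate branches (here `fit` and `plot`) do not depend on each other, which is the kind of structure a tool could use to run steps in parallel.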
Elaine Angelino, Daniel Yamins, Margo I. Seltzer
Added 13 Feb 2011
Updated 13 Feb 2011
Type Journal
Year 2010
Where IPAW