This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
Background: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implement...
The MapReduce framework is increasingly being used to analyze large volumes of data. One important type of data analysis done with MapReduce is log processing, in which a click-st...
Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Ju...
Heterogeneous information networks that contain multiple types of objects and links are ubiquitous in the real world, such as bibliographic networks, cyber-physical networks, and ...
Intelligence analysts grapple with many challenges, chief among them is the need for software support in storytelling, i.e., automatically ‘connecting the dots’ between dispar...
M. Shahriar Hossain, Patrick Butler, Arnold P. Boe...