This paper takes a renewed look at the problem of managing intermediate data that is generated during dataflow computations (e.g., MapReduce, Pig, Dryad, etc.) within clouds. We d...
Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil G...
Active storage clouds are an attractive platform for executing large data intensive workloads found in many fields of science. However, active storage presents new system managem...
Disk and network latency must be taken into account when applying parallel computing to large multidimensional datasets because they can hinder performance by reducing the rate at...
We describe our development work of tools for data collection and data management for wearable and ubiquitous computing. By embedding different kinds of sensory devices on the us...
Many areas of science are seeing a data deluge coming from new instruments, myriads of sensors and exponential growth in electronic records. We take two examples
Geoffrey Fox, Xiaohong Qiu, Scott Beason, Jong Y. ...