Most programs are repetitive: similar behavior recurs at different points during execution. Proposed on-line systems automatically group these similar intervals of execution in...
In this paper, we explore modeling overlapping biological processes. We discuss a probabilistic model of overlapping biological processes, gene membership in those processes, and ...
MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-...
Matei Zaharia, Andy Konwinski, Anthony D. Joseph, ...
Estimating frequency moments and Lp distances are well-studied problems in the adversarial data stream model, and tight space bounds are known for both problems. There has been...
Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...