The increasing availability of electronic communication data, such as that arising from e-mail exchange, presents social and information scientists with new possibilities for char...
R. Dean Malmgren, Jake M. Hofman, Luis A. N. Amara...
We present a technique that masks failures in a cluster to provide high availability and fault-tolerance for long-running, parallelized dataflows. We can use these dataflows to im...
Mehul A. Shah, Joseph M. Hellerstein, Eric A. Brew...
In this paper we consider distributed K-Nearest Neighbor (KNN) search and range query processing in high-dimensional data. Our approach is based on Locality Sensitive Hashing (LSH...
In this paper, we present a design for a generic, open, application-oriented performance instrumentation of multitier applications. Measurements are performed through configur...
Markus Schmid, Marcus Thoss, Thomas Termin, Reinho...
Mining evolving data streams for concept drifts has gained importance in applications such as customer behavior analysis, network intrusion detection, and credit card fraud detection. Se...