We examine the behavioral patterns of email usage in a large-scale enterprise over a three-month period. In particular, we focus on two main questions: (Q1) what do replies depend...
We initiate a systematic study of a special type of property testers. These testers consist of repeating a basic test for a number of times that depends on the proximity parameter...
The paper presents distributed and parallel -approximation algorithms for covering problems, where is the maximum number of variables on which any constraint depends (for example...
Wikipedia is an example of the large, collaborative, semi-structured data sets emerging on the Web. Typically, before these data sets can be used, they must transformed into struc...
Data lineage and data provenance are key to the management of scientific data. Not knowing the exact provenance and processing pipeline used to produce a derived data set often re...