Strict consistency of replicated data is infeasible or not required by many distributed applications, so current systems often permit stale replication, in which cached copies of ...
— Uncertainties in data arise for a number of reasons: when the data set is incomplete, contains conflicting information or has been deliberately perturbed or coarsened to remov...
Graham Cormode, Divesh Srivastava, Entong Shen, Ti...
Now motivated also by the partial support of major search engines, hundreds of millions of documents are being published on the web embedding semi-structured data in RDF, RDFa and ...
Abstract. We present in this paper a new model for representing probabilistic information in a semi-structured (XML) database, based on the use of probabilistic event variables. Th...
The detection of duplicate tuples, corresponding to the same real-world entity, is an important task in data integration and cleaning. While many techniques exist to identify such...