We present a declarative framework for collective deduplication of entity references in the presence of constraints. Constraints occur naturally in many data cleaning domains and c...
Auditing the changes to a database is critical for identifying malicious behavior, maintaining data quality, and improving system performance. But an accurate audit log is a histor...
Receiver operating characteristics (ROC) curves have the property that they start at (0,1) and end at (1,0) and are monotonically decreasing. Furthermore, a parametric representat...
The nature of semistructured data in web collections is evolving. Increasingly, XML web documents (or documents exchanged via web services) are valid with regard to a schema, yet ...
Mariano P. Consens, Flavio Rizzolo, Alejandro A. V...
Modern information systems often store data that has been transformed and integrated from a variety of sources. This integration may obscure the original source semantics of data ...