The objective of entity identification is to determine the correspondence between object instances from more than one database. This paper examines the problem at the instance lev...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
The DTD of a set of XML documents may change for many reasons, such as changes in real-world events, changes in the user’s requirements, and mistakes in the initial design....
Erwin Leonardi, Tran T. Hoai, Sourav S. Bhowmick, ...
In this paper, we describe and situate the TUPELO system for data mapping in relational databases. Automating the discovery of mappings between structured data sources is a long st...
The notion of user preference in database modeling has recently received much attention in advanced applications, such as personalization of e-services, since it captures...