Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...
We study the design issues of data-centric XML documents where (1) there are no mixed contents, i.e., each element may have some subelements and attributes, or it may have a singl...
We give a general framework for approximate query processing in semistructured databases. We focus on regular path queries, which are the integral part of most of the que...
We consider the problem of learning a record matching package (classifier) in an active learning setting. In active learning, the learning algorithm picks the set of examples to ...
Traditional database technology may be extended by taking advantage of the facilities of an integrated persistent programming environment. This paper focuses on how such an enviro...
Richard C. H. Connor, Quintin I. Cutts, Graham N. ...