Abstract. We present a new model for representing probabilistic information in a semi-structured (XML) database, based on probabilistic event variables. Th...
Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies, attempting to do data record detection and attribute labeling in two se...
Research and products for the integration of heterogeneous legacy source databases in data warehousing have addressed numerous data quality problems in or between the sources. Suc...
Matthias Jarke, Christoph Quix, Guido Blees, Dirk ...