One common predictive modeling challenge occurs in text mining problems is that the training data and the operational (testing) data are drawn from different underlying distributi...
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to d...
The topic of managing uncertain data has been explored in many ways. Different methodologies for data storage and query processing have been proposed. As the availability of manag...
Peter Benjamin Volk, Frank Rosenthal, Martin Hahma...
In this paper, we proposed a new approach, called FiVaTech for the problem of Web data extraction. FiVaTech is a page-level data extraction system which deduces the data schema an...
Mohammed Kayed, Chia-Hui Chang, Khaled F. Shaalan,...
Abstract--Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...