When multiple data sources are available for clustering, an a priori data integration process is usually required. This process may be costly and may not lead to good clusterings,...
Elisa Boari de Lima, Raquel Cardoso de Melo Minard...
Random data perturbation (RDP) has been in use for several years in statistical databases and public surveys as a means of providing privacy to individuals while collecting informa...
We create a support system for predicting end prices on eBay. The predictions are based on the item descriptions found in eBay item listings and on some numerica...
Dennis van Heijst, Rob Potharst, Michiel C. van We...
The appropriate choice of a method for imputation of missing data becomes especially important when the fraction of missing values is large and the data are of mixed type. The prop...
Vadim V. Ayuyev, Joseph Jupin, Philip W. Harris, Z...
In solving the classification problem in relational data mining, traditional methods such as C4.5 and its variants usually require data transformations from datasets sto...