The usefulness of the results produced by data mining methods can be critically impaired by several factors such as (1) low quality of data, including errors due to contamination, ...
Fang Chu, Yizhou Wang, Carlo Zaniolo, Douglas Stot...
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...
Abstract. We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speed-up with the number of computing nodes. The algorithm is the fi...
This paper presents a study on the combination of different classifiers for toxicity prediction. Two combination operators for the Multiple-Classifier System definition are also pr...
Large scale production computing grids introduce new challenges in debugging and troubleshooting. A user that submits a workload consisting of tens of thousands of jobs to a grid ...