Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
We study an algorithm for feature selection that clusters attributes using a special metric and then makes use of the dendrogram of the resulting cluster hierarchy to choose the m...
Richard Butterworth, Gregory Piatetsky-Shapiro, Da...
We propose a new system to mine visual knowledge on the Web. There is a huge amount of image data as well as text data on the Web. However, mining image data from the Web is paid less attent...
Homeland security measures are increasing the amount of data collected, processed and mined. At the same time, owners of the data have raised legitimate concerns about their privacy and...
An application of data mining techniques to heterogeneous database schema integration is introduced. We use attribute-oriented induction to mine for characteristic and classification ...