Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
When automatically extracting information from the World Wide Web, most established methods focus on spotting single HTML documents. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
Images are highly complex multidimensional signals, with rich and complicated information content. For this reason they are difficult to analyze through a unique automated approach...
Even though "providing insight" has been considered one of the main purposes of information visualization (InfoVis), we feel that insight is still a not-well-understood ...
Ji Soo Yi, Youn ah Kang, John T. Stasko, Julie A. ...