This paper is concerned with the problem of Imbalanced Classification (IC) in web mining, which often arises on the web due to the "Matthew Effect". As web IC applicatio...
Frequent patterns provide solutions to datasets that do not have well-structured feature vectors. However, frequent pattern mining is non-trivial since the number of unique patter...
Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Y...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and mor...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...