We show that excluding outliers from the training data significantly improves the kNN classifier, which in this case performs about 10% better than the best known method, Centroid-based...
Abstract. This paper reports our comparative evaluation of three machine learning methods on Chinese text categorization. Whereas a wide range of methods have been applied to Engli...
The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...
Due to the exponential growth of documents on the Internet and the emerging need to organize them, the automated categorization of documents under predefined labels has received an...
Abstract: Multi-label learning originated from the investigation of the text categorization problem, where each document may belong to several predefined topics simultaneously. In mul...