Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
The success of a software project is largely dependent upon the quality of the Software Requirements Specification (SRS) document, which serves as a medium to communicate user req...
Consider a supervised learning problem in which examples contain both numerical- and text-valued features. To use traditional featurevector-based learning methods, one could treat...
Sofus A. Macskassy, Haym Hirsh, Arunava Banerjee, ...
This work addresses the problem of classifying the genre of text, which is useful for a variety of language processing problems. We propose statistics of POS histograms as classiï...
Sergey Feldman, Marius A. Marin, Mari Ostendorf, M...
Text classification is one of the most actual among the natural language processing problems. In this paper the application of word-based PPM (Prediction by Partial Matching) mode...