Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
In spite of the high profile of media types such as video, audio and images, many multimedia presentations rely extensively on text content. Text can be used for incidental labels...
Dick C. A. Bulterman, A. J. Jansen, Pablo Cé...
Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed ...
This paper investigates the use of stemming for classification of Dutch (email) texts. We introduce a stemmer, which combines dictionary lookup (implemented efficiently as a finit...
Naïve Bayes (NB) classifier has long been considered a core methodology in text classification mainly due to its simplicity and computational efficiency. There is an increasing n...