Bayesian text classifiers face a common issue which is referred to as data sparsity problem, especially when the size of training data is very small. The frequently used Laplacian...
Labeling text data is quite time-consuming but essential for automatic text classification. Especially, manually creating multiple labels for each document may become impractical ...
This paper proposes a new approach for classifying text documents into two disjoint classes. The new approach is based on extracting patterns, in the form of two logical expressio...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering&...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
This paper describes an advanced system for multilingual text classification adaptable to different user needs. The system has been initially developed as an applied research proje...
Fabio Ciravegna, Alberto Lavelli, Nadia Mana, Joha...