In this paper, we propose a new classification method for assigning textual documents to multiple categories. We call it Matrix Regression (MR) due to its resem...
Iulian Sandu Popa, Karine Zeitouni, Georges Gardar...
We consider a parsed text corpus as an instance of a labelled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between the...
In this paper, we present a new method for segmenting text in natural scenes. The method is based on a morphological operator, the Toggle Mapping. The efficiency of the met...
Jonathan Fabrizio, Beatriz Marcotegui, Matthieu Co...
This paper presents a cluster-based text categorization system that uses class distributional clustering of words. We propose a new clustering model that considers the global in...
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper sh...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...