Sciweavers

CBMS
2007
IEEE

Text Categorization for Multi-label Documents and Many Categories

In this paper, we propose a new classification method that addresses the classification of textual documents into multiple categories. We call it Matrix Regression (MR) due to its resemblance to regression in a high-dimensional space. Experiments on a medical corpus of hospital records to be classified by ICD (International Classification of Diseases) code demonstrate the validity of the MR approach. We compared MR with three algorithms frequently used in text categorization: k-Nearest Neighbors, Centroid, and Support Vector Machine. The experimental results show that our method outperforms them in both precision and classification time.
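For orientation, the Centroid baseline named in the abstract can be sketched roughly as follows. This is not the authors' Matrix Regression method, whose details the abstract does not give. The whitespace tokenization, the cosine threshold of 0.5, the function names, and the toy records and labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(docs):
    """Sum the term vectors of all documents carrying each label.

    The sum is used instead of the mean because cosine similarity
    ignores vector length, so both give the same scores.
    """
    centroids = defaultdict(Counter)
    for text, labels in docs:
        vec = vectorize(text)
        for label in labels:
            centroids[label].update(vec)
    return centroids


def predict(centroids, text, threshold=0.5):
    """Multi-label decision: keep every label whose centroid is similar enough."""
    vec = vectorize(text)
    return {label for label, c in centroids.items() if cosine(vec, c) >= threshold}


# Hypothetical toy records; a real corpus would use ICD codes as labels.
TRAIN = [
    ("high blood pressure hypertension", {"hypertension"}),
    ("elevated glucose diabetes insulin", {"diabetes"}),
    ("hypertension and diabetes with insulin therapy", {"hypertension", "diabetes"}),
]

centroids = train_centroids(TRAIN)
single = predict(centroids, "patient with hypertension and high blood pressure")
multi = predict(centroids, "hypertension diabetes insulin")
```

Because each label is thresholded independently, a document can receive zero, one, or several categories, which is the multi-label setting the paper targets.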
Added 02 Jun 2010
Updated 02 Jun 2010
Type Conference
Year 2007
Where CBMS
Authors Iulian Sandu Popa, Karine Zeitouni, Georges Gardarin, Didier Nakache, Elisabeth Métais