Learning-Free Text Categorization

In this paper, we report on the fusion of simple retrieval strategies with thesaural resources to perform large-scale text categorization tasks. Unlike most related systems, which rely on training data to infer text-to-concept relationships, our approach can be applied with any controlled vocabulary and does not use any training data. The first classification module uses a traditional vector-space retrieval engine, which has been fine-tuned for the task, while the second classifier is based on regular variations of the concept list. For evaluation purposes, the system uses a sample of MedLine, with the Medical Subject Headings (MeSH) terminology as the collection of concepts. Preliminary results show that the hybrid system performs significantly better than either single system. For top returned concepts, the system reaches performance comparable to that of machine learning systems, while genericity and scalability issues are clearly in favor of the learn...
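The hybrid design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the TF-IDF weighting, the reading of "regular variations" as regular-expression variants (optional plural, flexible separators), and the linear fusion weight are all assumptions made for illustration. The only input is a controlled vocabulary; no training data is used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vector_space_scores(document, concepts):
    """Cosine similarity between the document and each concept label.

    Assumption: TF-IDF weights computed over the concept list itself;
    document tokens that appear in no concept label are ignored.
    """
    concept_tokens = {c: tokenize(c) for c in concepts}
    n = len(concepts)
    df = Counter(t for toks in concept_tokens.values() for t in set(toks))
    idf = {t: math.log(1 + n / df[t]) for t in df}
    doc_tf = Counter(t for t in tokenize(document) if t in idf)
    doc_vec = {t: tf * idf[t] for t, tf in doc_tf.items()}
    doc_norm = math.sqrt(sum(w * w for w in doc_vec.values()))
    scores = {}
    for concept, toks in concept_tokens.items():
        vec = {t: tf * idf[t] for t, tf in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        dot = sum(w * doc_vec.get(t, 0.0) for t, w in vec.items())
        scores[concept] = dot / (norm * doc_norm) if norm and doc_norm else 0.0
    return scores


def pattern_scores(document, concepts):
    """Score 1.0 when a regular-expression variant of the concept occurs.

    Assumption: a variant allows an optional plural "s" on each word and
    whitespace or hyphens between words.
    """
    text = document.lower()
    scores = {}
    for concept in concepts:
        words = [re.escape(w) + r"s?" for w in tokenize(concept)]
        pattern = r"\b" + r"[\s-]+".join(words) + r"\b"
        scores[concept] = 1.0 if words and re.search(pattern, text) else 0.0
    return scores


def categorize(document, concepts, weight=0.5, top_k=3):
    """Fuse both scorers linearly (hypothetical weight) and rank concepts."""
    vs = vector_space_scores(document, concepts)
    pm = pattern_scores(document, concepts)
    fused = {c: weight * vs[c] + (1 - weight) * pm[c] for c in concepts}
    return sorted(concepts, key=lambda c: fused[c], reverse=True)[:top_k]


if __name__ == "__main__":
    vocabulary = ["Heart Failure", "Kidney Failure", "Diabetes Mellitus"]
    abstract = "Patients with chronic heart failure were followed for two years."
    print(categorize(abstract, vocabulary))
```

In this sketch the pattern matcher rewards exact or near-exact mentions of a concept, while the vector-space scorer still gives partial credit to concepts that share only some words with the text, which is one plausible reason a fused ranking could beat either scorer alone.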
Added 06 Jul 2010
Updated 06 Jul 2010
Type Conference
Year 2003
Where AIME
Authors Patrick Ruch, Robert H. Baud, Antoine Geissbühler