: This research proposes a new strategy where documents are encoded into string vectors and modified version of KNN to be adaptable to string vectors for text categorization. Tradi...
Abstract. This paper proposes the use of Latent Semantic Indexing (LSI) techniques, decomposed with semi-discrete matrix decomposition (SDD) method, for text categorization. The SD...
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple in...
On a multi-dimensional text categorization task, we compare the effectiveness of a feature based approach with the use of a stateof-the-art sequential learning technique that has ...
Abstract. Classification in genres and domains is a major field of research for Information Retrieval (scientific and technical watch, datamining, etc.) and the selection of app...