A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form. Searching and browsing are the typical ways that users ...
Abstract. This paper proposes the use of Latent Semantic Indexing (LSI) techniques, decomposed with semi-discrete matrix decomposition (SDD) method, for text categorization. The SD...
Associative classification, which originates from numerical data mining, has been applied to deal with text data recently. Text data is firstly digitalized to database of transact...
Baoli Li, Neha Sugandh, Ernest V. Garcia, Ashwin R...
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...