An Effective and Robust Method for Short Text Classification

Classification of texts that may contain complex, domain-specific terminology requires learning methods that do not rely on extensive feature engineering. In this work we use prediction by partial matching (PPM), a text compression method that captures text features and builds a language model adapted to a particular text. We show that the method achieves high text classification accuracy and can be used as an alternative to state-of-the-art learning algorithms.

Motivation

We focus on classification of texts with a high concentration of specific terminology and complex grammatical structures. These characteristics inevitably complicate standard feature engineering, which relies on language pre-processing (e.g., lemmatization, parsing) and becomes even harder when the texts are short. Our goal is to avoid complex and potentially error-prone feature construction by using a learning method that can perform reasonably well without preliminary feature engineering. ...
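
The idea of classifying by compression can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified character-level PPM-style model (escape probabilities in the style of method C, no exclusions, no arithmetic coder) and assigns a text to the class whose model gives it the lowest cross-entropy in bits per character. The class name `CharPPM`, the `classify` helper, the model order, and the toy training strings are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class CharPPM:
    """Simplified PPM-style character model (assumption: method-C-like
    escapes, no exclusions). Illustrative only."""

    def __init__(self, order=3, alphabet_size=256):
        self.order = order
        self.alphabet_size = alphabet_size
        # counts[k][context] -> Counter of characters seen after a
        # context of length k
        self.counts = [defaultdict(Counter) for _ in range(order + 1)]

    def train(self, text):
        for i, ch in enumerate(text):
            for k in range(self.order + 1):
                if i - k < 0:
                    break
                self.counts[k][text[i - k:i]][ch] += 1

    def _prob(self, context, ch):
        prob = 1.0
        # Try the longest available context first, then back off.
        for k in range(min(self.order, len(context)), -1, -1):
            ctx = context[len(context) - k:]
            seen = self.counts[k].get(ctx)
            if not seen:
                continue  # context never observed: back off at no cost
            total = sum(seen.values())
            distinct = len(seen)
            if ch in seen:
                return prob * seen[ch] / (total + distinct)
            prob *= distinct / (total + distinct)  # escape probability
        # Uniform fallback when the character was never seen at all.
        return prob / self.alphabet_size

    def cross_entropy(self, text):
        """Average number of bits per character needed to encode text."""
        bits = 0.0
        for i, ch in enumerate(text):
            context = text[max(0, i - self.order):i]
            bits -= math.log2(self._prob(context, ch))
        return bits / max(len(text), 1)


def classify(text, models):
    """Return the label whose model encodes the text most compactly."""
    return min(models, key=lambda label: models[label].cross_entropy(text))


# Hypothetical toy training data, for illustration only.
training = {
    "medical": "the patient was given aspirin for chest pain and fever ",
    "legal": "the court ruled that the contract was void and unenforceable ",
}
models = {}
for label, corpus in training.items():
    model = CharPPM(order=3)
    model.train(corpus)
    models[label] = model

print(classify("patient has fever and pain", models))
```

Because the model works directly on characters, it needs no lemmatization, parsing, or other feature construction, which is the property the abstract emphasizes for short, terminology-heavy texts.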
Victoria Bobicev, Marina Sokolova
Added 02 Oct 2010
Updated 02 Oct 2010
Type Conference
Year 2008
Where AAAI