CSJM
2006

Text Classification Using Word-Based PPM Models

13 years 4 months ago

Download www.math.md

Text classification is one of the most topical problems in natural language processing. This paper describes the application of a word-based PPM (Prediction by Partial Matching) model to automatic content-based text classification. Our main idea is that words, and especially word combinations, are more relevant features for many text classification tasks. In most cases the keywords of a document are not single words but combinations of two or three words. The experiments demonstrated the applicability of word-based PPM models to content-based text classification. Although in some cases the entropy difference that determined the choice was rather small (several hundredths), most of the documents (up to 97%) were classified correctly.
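The decision rule described in the abstract (assign a document to the class whose model gives it the lowest entropy) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a simplified order-1 word model with a PPM method-C-style escape estimate and no exclusions, and the names `WordModel`, `tokenize`, `classify`, and the `vocab_size` parameter are hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


class WordModel:
    """Simplified order-1 word model with escape to order 0, then to a uniform model.

    Assumption: escape probability is distinct / (total + distinct), as in
    PPM method C; exclusions and higher orders are omitted for brevity.
    """

    def __init__(self, vocab_size=100_000):
        self.vocab_size = vocab_size          # assumed size of the uniform fallback alphabet
        self.unigrams = Counter()             # order-0 word counts
        self.bigrams = defaultdict(Counter)   # order-1 counts: previous word -> next word

    def train(self, text):
        words = tokenize(text)
        self.unigrams.update(words)
        for prev, word in zip(words, words[1:]):
            self.bigrams[prev][word] += 1

    def _prob(self, prev, word):
        p = 1.0
        followers = self.bigrams.get(prev)
        if followers:
            total, distinct = sum(followers.values()), len(followers)
            if word in followers:
                return followers[word] / (total + distinct)
            p *= distinct / (total + distinct)    # escape from order 1
        total, distinct = sum(self.unigrams.values()), len(self.unigrams)
        if word in self.unigrams:
            return p * self.unigrams[word] / (total + distinct)
        if total:
            p *= distinct / (total + distinct)    # escape from order 0
        return p / self.vocab_size                # uniform fallback for unseen words

    def bits_per_word(self, text):
        """Cross-entropy of the text under this model, in bits per word."""
        words = tokenize(text)
        if not words:
            return float("inf")
        bits, prev = 0.0, None
        for word in words:
            bits -= math.log2(self._prob(prev, word))
            prev = word
        return bits / len(words)


def classify(models, text):
    """Return the label whose model encodes the text with the fewest bits per word."""
    return min(models, key=lambda label: models[label].bits_per_word(text))
```

A model would be trained per class on that class's documents, and `classify` then compares the per-word entropies. When two classes yield nearly equal values, the decision rests on a small difference, which is the situation the abstract mentions.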

Victoria Bobicev

Content-based Text Classification | CSJM 2006 | Text Classification | Word-based PPM

Related Content

» An Effective and Robust Method for Short Text Classification

» Improving PPM Algorithm Using Dictionaries

» Word-Based Statistical Compressors as Natural Language Compression Boosters

» Combining Structural and Textual Contexts for Compressing Semistructured Databases

» Towards an optimal weighting of context words based on distance

» Using structural contexts to compress semistructured text collections

» Multilabel ASRS Dataset Classification Using Semi Supervised Subspace Clustering

» Lossless Compression Based on the Sequence Memoizer

» Bilingual Text Classification using the IBM 1 Translation Model

Post Info

Added	11 Dec 2010
Updated	11 Dec 2010
Type	Journal
Year	2006
Where	CSJM
Authors	Victoria Bobicev