Genetic Algorithms in Syllable-Based Text Compression

Abstract. Syllable-based text compression is a new approach to compression by symbols. In this approach, syllables are used as the compression symbols instead of the more common characters or words. The technique has proven effective, especially on short to medium-length text files. The effectiveness of the compression is greatly affected by the quality of the dictionaries of syllables characteristic of a given language. These dictionaries are usually created by a straightforward analysis of text corpora. In this paper we introduce another way of obtaining these dictionaries: using a genetic algorithm. We believe that dictionaries built this way may help lower the compression ratio. We will measure this effect on a set of Czech and English texts.
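To make the idea concrete, here is a minimal sketch of how a genetic algorithm could select a syllable dictionary. It is an illustration only, not the authors' method: the splitter `naive_syllables`, the cost model in `cost`, and every parameter of `evolve` are assumptions made for this example. Each individual is a bit mask over candidate syllables, and fitness is a rough size estimate in which a dictionary syllable costs one unit and any other syllable costs its length plus one escape unit.

```python
import random
import re


def naive_syllables(word):
    """Toy splitter (assumption): cut after each vowel group.

    This is not a real hyphenation or syllabification algorithm.
    """
    word = word.lower()
    parts = re.findall(r"[^aeiouy]*[aeiouy]+|[^aeiouy]+$", word)
    return parts or [word]


def cost(mask, candidates, tokens):
    """Estimated size under a hypothetical cost model (lower is better)."""
    chosen = {s for s, bit in zip(candidates, mask) if bit}
    # Assumed penalty for carrying each dictionary entry.
    dict_cost = sum(len(s) + 1 for s in chosen)
    # One unit per dictionary hit, escape plus characters otherwise.
    text_cost = sum(1 if t in chosen else len(t) + 1 for t in tokens)
    return dict_cost + text_cost


def evolve(candidates, tokens, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Evolve a bit mask over `candidates`; returns the best mask found."""
    rng = random.Random(seed)
    n = len(candidates)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: cost(m, candidates, tokens))
        survivors = pop[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability `mutation_rate`.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda m: cost(m, candidates, tokens))


if __name__ == "__main__":
    text = "the compression of the text depends on the dictionary of the text"
    tokens = [s for w in text.split() for s in naive_syllables(w)]
    candidates = sorted(set(tokens))
    best = evolve(candidates, tokens)
    print(cost(best, candidates, tokens),
          [s for s, bit in zip(candidates, best) if bit])
```

Because the better half of each generation survives unchanged, the best estimated cost never increases from one generation to the next. A realistic system would replace the toy splitter with a language-aware one and the cost model with the actual coder's output size.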
Tomas Kuthan, Jan Lansky
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Authors Tomas Kuthan, Jan Lansky