DCC
1992
IEEE

Constructing Word-Based Text Compression Algorithms

Text compression algorithms are normally defined in terms of a source alphabet of 8-bit ASCII codes. We consider instead choosing the source alphabet to be one whose symbols are the words of English or, more generally, alternating maximal strings of alphanumeric and non-alphanumeric characters. A compression algorithm could then take advantage of longer-range correlations between words and thus achieve better compression. The large size of such an alphabet leads to some implementation problems, but these are overcome to construct word-based LZW, word-based Adaptive Huffman, and word-based Context Modelling compression algorithms.
R. Nigel Horspool, Gordon V. Cormack
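
The word alphabet described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the function names (`tokenize`, `lzw_encode`, `lzw_decode`) are hypothetical, the tokenizer assumes ASCII letters and digits count as alphanumeric, and emitting a first-seen token as a literal is an assumed way of coping with an alphabet that is not known in advance (the abstract does not say how the paper handles this). Output is a Python list rather than a packed bit stream.

```python
import re

def tokenize(text):
    """Split text into alternating maximal runs of alphanumeric and
    non-alphanumeric characters (ASCII classes are an assumption)."""
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)

def lzw_encode(tokens):
    """LZW over word tokens. A first-seen token is emitted as a literal
    string (an assumed escape scheme); known phrases are emitted as ints."""
    table = {}          # phrase (tuple of tokens) -> code
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if (tok,) not in table:
            table[(tok,)] = len(table)   # new symbol joins the alphabet
            out.append(tok)
            i += 1
            continue
        # Find the longest phrase starting at i that is already in the table.
        phrase = (tok,)
        j = i + 1
        while j < len(tokens) and phrase + (tokens[j],) in table:
            phrase += (tokens[j],)
            j += 1
        out.append(table[phrase])
        if j < len(tokens):
            table[phrase + (tokens[j],)] = len(table)  # extend the dictionary
        i = j
    return out

def lzw_decode(items):
    """Inverse of lzw_encode: rebuilds the same table while reading items."""
    table = []          # code -> phrase
    out = []
    prev = None         # phrase of the previous code item, awaiting extension
    for item in items:
        if isinstance(item, str):        # literal: a first-seen token
            phrase = (item,)
            if prev is not None:
                table.append(prev + (item,))
            table.append(phrase)
            prev = None
        else:
            if item < len(table):
                phrase = table[item]
            else:                        # code for the entry still being defined
                phrase = prev + (prev[0],)
            if prev is not None:
                table.append(prev + (phrase[0],))
            prev = phrase
        out.extend(phrase)
    return out

text = "the cat and the cat, the cat and the dog"
tokens = tokenize(text)
items = lzw_encode(tokens)
assert "".join(lzw_decode(items)) == text   # lossless round trip
```

Because the tokens alternate between the two character classes and together cover every character, concatenating them reproduces the input exactly, so no separate handling of spaces or punctuation is needed. The dictionary here grows without bound; a practical coder would have to limit or prune it, which is one of the implementation problems a large alphabet raises.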
Added: 10 Aug 2010
Updated: 10 Aug 2010
Type: Conference
Year: 1992
Where: DCC
Authors: R. Nigel Horspool, Gordon V. Cormack