WWW 2004 (ACM)

Experiments with Persian text compression for web

The increasing importance of Unicode for text encoding implies a possible doubling of data storage space and data transmission time, with a corresponding need for data compression. The approach presented in this paper aims to reduce the storage space and transmission time of Persian text files in web-based applications and on the Internet. The basic idea is to find the most frequently repeated n-grams in Persian text and replace each of them with a single character from the user-defined sections of Unicode. Compression is done once, on the server side, and no separate decompression step is required: the rendering process in the browser performs the decompression. No additional program or add-in needs to be installed on the browser or client side; the user only needs to download the proper Unicode font once. A genetic algorithm is used to select the most appropriate n-grams. In the best case, we achieved a 52.26% reduction in file size. T...
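The substitution idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the "user-defined sections" correspond to the Unicode Private Use Area starting at U+E000, and it swaps the paper's genetic algorithm for a plain frequency ranking. The function names (`top_ngrams`, `build_table`, `compress`, `expand`) and the parameters `n_values` and `k` are hypothetical. In the paper the browser's font renders the replacement characters directly, so `expand` exists here only to check the round trip.

```python
from collections import Counter

# Assumption: replacement characters come from the Private Use Area of the
# Basic Multilingual Plane (U+E000..U+F8FF).
PUA_START = 0xE000
PUA_END = 0xF8FF


def top_ngrams(text, n_values=(2, 3), k=16):
    """Rank n-grams by characters saved; a stand-in for the genetic algorithm."""
    counts = Counter()
    for n in n_values:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    # Replacing an n-gram of length n by one character saves (n - 1) characters
    # per occurrence. Ties are broken alphabetically to keep the result stable.
    ranked = sorted(counts.items(),
                    key=lambda kv: (-(len(kv[0]) - 1) * kv[1], kv[0]))
    return [gram for gram, count in ranked[:k] if count > 1]


def build_table(ngrams):
    """Assign one Private Use Area character to each selected n-gram."""
    if len(ngrams) > PUA_END - PUA_START + 1:
        raise ValueError("more n-grams than available private-use code points")
    return {gram: chr(PUA_START + i) for i, gram in enumerate(ngrams)}


def compress(text, table):
    """Greedy left-to-right replacement, trying the longest n-gram first."""
    if any(PUA_START <= ord(ch) <= PUA_END for ch in text):
        raise ValueError("input already contains private-use characters")
    max_len = max(map(len, table), default=0)
    out = []
    i = 0
    while i < len(text):
        for n in range(max_len, 1, -1):
            piece = text[i:i + n]
            if piece in table:
                out.append(table[piece])
                i += len(piece)
                break
        else:
            # No n-gram starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def expand(compressed, table):
    """Inverse mapping, used here only to verify the round trip."""
    reverse = {code: gram for gram, code in table.items()}
    return "".join(reverse.get(ch, ch) for ch in compressed)
```

A short usage example on a made-up Persian sentence:

```python
sample = "این یک متن است و این یک متن نمونه است"
table = build_table(top_ngrams(sample))
packed = compress(sample, table)
assert expand(packed, table) == sample
print(len(sample), len(packed))
```

The sketch counts characters, not bytes. The actual saving depends on the encoding: a private-use character in the Basic Multilingual Plane takes three bytes in UTF-8 and two in UTF-16, while each Persian letter takes two bytes in either encoding.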
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2004
Where WWW
Authors Farhad Oroumchian, Ehsan Darrudi, Fattane Taghiyareh, Neeyaz Angoshtari