ICDAR
2009
IEEE

Enhanced Text Extraction from Arabic Degraded Document Images Using EM Algorithm

This paper presents a new enhanced algorithm for extracting text from degraded document images, based on probabilistic models. The observed document image is modeled as a mixture of Gaussian densities that represent the foreground and background components of the document image. The EM algorithm is used to estimate and improve the parameters of the mixture densities recursively. The initial parameters of the EM algorithm are estimated by the k-means clustering method. After parameter estimation, the document image is partitioned into text and background classes by means of a maximum likelihood (ML) approach. The performance of the proposed approach is evaluated on a variety of degraded documents drawn from the collections of the National Library of Tunisia.
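The pipeline described in the abstract (k-means initialization, EM refinement of a Gaussian mixture, then ML labeling) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes a grayscale image, exactly two mixture components, gray level as the only feature, and dark text on a lighter background. The function names `kmeans_1d`, `em_gmm_1d`, and `extract_text` are hypothetical.

```python
import numpy as np

def gaussian(x, mean, var):
    """Univariate Gaussian density."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def kmeans_1d(x, iters=20):
    """Two-cluster k-means on gray levels; returns initial mixture parameters."""
    centers = np.array([x.min(), x.max()], dtype=float)
    for _ in range(iters):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    weights = np.array([(labels == k).mean() for k in range(2)])
    weights = np.clip(weights, 1e-6, None)
    weights /= weights.sum()
    variances = np.array([
        x[labels == k].var() + 1e-6 if np.any(labels == k) else 1.0
        for k in range(2)
    ])
    return weights, centers, variances

def em_gmm_1d(x, weights, means, variances, iters=50, tol=1e-6):
    """Refine the mixture parameters with EM until the log-likelihood settles."""
    prev_ll = -np.inf
    for _ in range(iters):
        # E-step: posterior probability of each component for each pixel
        joint = weights[None, :] * gaussian(x[:, None], means[None, :], variances[None, :])
        total = joint.sum(axis=1, keepdims=True) + 1e-300
        resp = joint / total
        # M-step: re-estimate weights, means, and variances
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means[None, :]) ** 2).sum(axis=0) / nk + 1e-6
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, variances

def extract_text(image):
    """Return a boolean mask that is True where a pixel is labeled as text."""
    x = np.asarray(image, dtype=float).ravel()
    _, means, variances = em_gmm_1d(x, *kmeans_1d(x))
    # ML decision: pick the class with the highest class-conditional likelihood
    likelihood = gaussian(x[:, None], means[None, :], variances[None, :])
    labels = likelihood.argmax(axis=1)
    text_class = means.argmin()  # assumption: text is the darker component
    return (labels == text_class).reshape(np.shape(image))
```

Calling `extract_text(img)` on a 2-D array of gray levels yields a mask of the same shape. The paper may use richer features or more components than this sketch assumes.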
Wafa Boussellaa, Aymen Bougacha, Abderrazak Zahour
Added: 21 May 2010
Updated: 21 May 2010
Type: Conference
Year: 2009
Where: ICDAR
Authors: Wafa Boussellaa, Aymen Bougacha, Abderrazak Zahour, Haikal El Abed, Adel M. Alimi