ICPR 2008, IEEE

Improvements in hidden Markov model based Arabic OCR

This paper describes recent advances in hidden Markov model (HMM) based OCR for machine-printed Arabic documents. A combination of script-independent and script-specific techniques is applied to glyph models and language models (LMs). The script-independent techniques we applied are higher-order n-gram LMs for N-best rescoring and discriminative estimation of glyph HMMs. The Arabic-specific techniques include the use of context-dependent HMMs for glyph modeling and Parts-of-Arabic-Words in language modeling. We present experimental results that demonstrate a 40% relative reduction in word error rate over the baseline configuration on a corpus of machine-printed Arabic documents.
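
To make the N-best rescoring step mentioned in the abstract more concrete, here is a minimal sketch of the general idea: a first pass produces a list of candidate transcriptions with glyph-model scores, and a stronger language model then re-ranks them. The `Hypothesis` class, the `rescore` function, the `lm_weight` parameter, and all scores are hypothetical and chosen for illustration only; the abstract does not specify how the authors combine scores.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One entry of an N-best list (hypothetical structure, not from the paper)."""
    text: str
    glyph_logprob: float  # log-score assigned by the glyph HMMs in the first pass
    lm_logprob: float     # log-score assigned by a higher-order n-gram LM


def rescore(nbest, lm_weight=1.0):
    """Return the hypotheses sorted best-first by a weighted sum of log-scores.

    Assumption: a simple log-linear combination of the two scores.
    """
    return sorted(
        nbest,
        key=lambda h: h.glyph_logprob + lm_weight * h.lm_logprob,
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up scores: the glyph models slightly prefer "hyp A",
    # but the language model strongly prefers "hyp B".
    nbest = [
        Hypothesis("hyp A", glyph_logprob=-10.0, lm_logprob=-8.0),
        Hypothesis("hyp B", glyph_logprob=-10.5, lm_logprob=-5.0),
    ]
    best = rescore(nbest)[0]
    print(best.text)
```

With `lm_weight=0.0` the original first-pass ranking is kept, so the weight controls how much the language model is allowed to override the glyph models.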
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICPR
Authors Rohit Prasad, Shirin Saleem, Matin Kamali, Ralf Meermeier, Premkumar Natarajan