
ICDAR
1999
IEEE

Partitioning and Searching Dictionary for Correction of Optically Read Devanagari Character Strings

Abstract. This paper describes a method for correcting optically read Devanagari character strings using a Hindi word dictionary. The word dictionary is partitioned to reduce the search space and to prevent a forced match to an incorrect word. The partitioning strategy takes the underlying OCR process into account. At the top level, the dictionary words are divided into two partitions: a short-words partition and a remaining-words partition. The short-words partition is sub-partitioned using the envelope information of the words. The envelope consists of the number of top, lower, and core modifiers, along with the number of core characters. Devanagari characters are written in three strips. Most characters, referred to as core characters, are written in the middle strip. The remaining words are further partitioned using tags. A tag is a fixed-length string associated with each partition. The correction process uses a distance matrix for assigning ...
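The envelope-based sub-partitioning described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the representation of a word as (symbol, strip label) pairs, the label names, the function names, and the rule that a "short" word has at most `max_core` core characters are all hypothetical, since the abstract does not specify them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical labels for where a symbol sits; not taken from the paper.
TOP, LOWER, CORE_MOD, CORE = "top", "lower", "core_mod", "core"

Word = List[Tuple[str, str]]          # (symbol, label) pairs, assumed input format
Envelope = Tuple[int, int, int, int]  # (top, lower, core modifiers, core characters)


def envelope(word: Word) -> Envelope:
    """Count top, lower, and core modifiers and core characters in a word."""
    counts = {TOP: 0, LOWER: 0, CORE_MOD: 0, CORE: 0}
    for _symbol, label in word:
        counts[label] += 1
    return (counts[TOP], counts[LOWER], counts[CORE_MOD], counts[CORE])


def partition_short_words(
    words: List[Word], max_core: int = 2
) -> Tuple[Dict[Envelope, List[Word]], List[Word]]:
    """Split words into envelope-keyed short-word partitions and the rest.

    Assumption: a word is "short" when it has at most `max_core` core
    characters. The paper's actual criterion is not given in the abstract.
    """
    short: Dict[Envelope, List[Word]] = defaultdict(list)
    remaining: List[Word] = []
    for word in words:
        env = envelope(word)
        if env[3] <= max_core:
            short[env].append(word)
        else:
            remaining.append(word)
    return dict(short), remaining


# Placeholder romanized symbols stand in for recognized Devanagari symbols.
w1 = [("k", CORE), ("e", TOP)]
w2 = [("g", CORE), ("ai", TOP)]
w3 = [("k", CORE), ("u", LOWER)]
w4 = [("k", CORE), ("m", CORE), ("l", CORE), ("n", CORE)]
short_parts, rest = partition_short_words([w1, w2, w3, w4])
```

With this layout, a recognized short string only needs to be compared against dictionary words sharing its envelope, which is how partitioning can shrink the search space. How the remaining words are assigned tags is not detailed in the abstract, so it is left out of the sketch.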
Veena Bansal, R. M. K. Sinha
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999
Where ICDAR
Authors Veena Bansal, R. M. K. Sinha