Word segmentation of off-line handwritten documents

Word segmentation is the most critical pre-processing step for any handwritten document recognition/retrieval system. This paper describes an approach to separate a line of unconstrained (written in a natural manner) handwritten text into words. When the writing style is unconstrained, recognition of individual components may be unreliable, so they must be grouped together into word hypotheses before recognition algorithms can be used. Our approach uses a set of both local and global features, motivated by the way human beings perform this kind of task. In addition, to overcome the disadvantages of individual distance measures, we propose an average distance computed using three different methods. The system is evaluated on an unconstrained handwriting database containing 50 pages (1,026 lines, 7,562 word images) of handwritten documents. The overall accuracy is 90.82%, which is better than that of a previous method.
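To make the idea of an averaged gap distance concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not name the three distance methods, so the three used here (bounding-box gap, horizontal run-length gap, and minimum Euclidean pixel distance) are assumptions chosen for illustration. The fixed threshold in the hypothetical segment_line helper is also an assumption and stands in for the paper's use of local and global features, which this sketch does not model.

```python
from math import hypot

# A component is a set of (x, y) foreground pixel coordinates.

def bounding_box_gap(left, right):
    """Horizontal gap between the two bounding boxes (0 if they overlap)."""
    return max(0, min(x for x, _ in right) - max(x for x, _ in left))

def run_length_gap(left, right):
    """Smallest horizontal gap measured along rows that both components touch."""
    rightmost = {}
    for x, y in left:
        rightmost[y] = max(x, rightmost.get(y, x))
    leftmost = {}
    for x, y in right:
        leftmost[y] = min(x, leftmost.get(y, x))
    shared_rows = rightmost.keys() & leftmost.keys()
    if not shared_rows:
        # No common row: fall back to the bounding-box gap (an assumption).
        return bounding_box_gap(left, right)
    return max(0, min(leftmost[y] - rightmost[y] for y in shared_rows))

def euclidean_gap(left, right):
    """Minimum Euclidean distance between any two pixels of the components."""
    return min(hypot(xr - xl, yr - yl) for xl, yl in left for xr, yr in right)

def average_gap(left, right):
    """Average of the three assumed distance measures."""
    total = (bounding_box_gap(left, right)
             + run_length_gap(left, right)
             + euclidean_gap(left, right))
    return total / 3

def segment_line(components, threshold):
    """Group components, ordered left to right, into word hypotheses.

    A gap whose average distance exceeds `threshold` starts a new word.
    """
    ordered = sorted(components, key=lambda c: min(x for x, _ in c))
    if not ordered:
        return []
    words = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if average_gap(prev, cur) > threshold:
            words.append([cur])
        else:
            words[-1].append(cur)
    return words
```

For example, calling segment_line on three small components with a threshold between the two gap sizes groups the first two components into one word hypothesis and leaves the third on its own. Averaging is meant to dampen cases where a single measure misjudges a gap, such as slanted strokes that make bounding boxes overlap.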
Chen Huang, Sargur N. Srihari
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where DRR
Authors Chen Huang, Sargur N. Srihari