Sciweavers

ICDAR
2003
IEEE

Word Segmentation of Handwritten Dates in Historical Documents by Combining Semantic A-Priori-Knowledge with Local Features

The recognition of script in historical documents requires suitable techniques for identifying single words. Segmentation of lines and words is a challenging task because lines are not straight and words may intersect within and between lines. For correct word segmentation, the conventional analysis of distances between text objects needs to be supplemented by a second component that predicts possible word boundaries based on semantic information. For date entries, hypotheses about potential boundaries are generated from knowledge about the different ways in which dates are written in the documents. This knowledge is modeled by distribution curves for potential boundary locations. Word boundaries are detected by classifying local features, such as distances between adjacent text objects, together with location-based boundary distribution curves as a-priori knowledge. We applied the technique to date entries in historical church registers. Documents from the 18th and 19th cen...
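
The idea of combining local gap distances with a location-based prior can be illustrated with a minimal sketch. This is not the authors' classifier: the scoring functions, the multiplication of the two scores, and every name and parameter below (`gap_score`, `location_prior`, `detect_boundaries`, `spread`, `gap_scale`) are assumptions chosen only for illustration. Each gap between adjacent text objects receives a local score that grows with its width and a prior score that peaks where a boundary is expected for the assumed date layout.

```python
import math

def gap_score(width, gap_scale=10.0):
    """Local feature (assumed form): wider gaps are more likely word boundaries."""
    return 1.0 - math.exp(-width / gap_scale)

def location_prior(position, expected_positions, spread=0.08):
    """A-priori curve (assumed form): Gaussian-shaped bumps, each peaking at 1,
    centred on expected boundary positions given as fractions of the line width."""
    return max(
        math.exp(-0.5 * ((position - m) / spread) ** 2)
        for m in expected_positions
    )

def detect_boundaries(gaps, line_width, expected_positions, n_boundaries):
    """Rank gaps by local score times prior and keep the best n_boundaries.

    gaps: list of (x_center, width) in pixels, ordered left to right.
    Returns the indices of the selected gaps in ascending order.
    """
    scores = [
        gap_score(width) * location_prior(x / line_width, expected_positions)
        for x, width in gaps
    ]
    ranked = sorted(range(len(gaps)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_boundaries])

# Hypothetical date entry with three words, so two boundaries are expected,
# roughly one third and two thirds of the way along the line.
gaps = [(50, 4), (100, 12), (150, 5), (200, 11), (260, 14)]
selected = detect_boundaries(gaps, line_width=300,
                             expected_positions=[0.33, 0.67], n_boundaries=2)
print(selected)
```

In this made-up example the widest gap is the last one, so a purely distance-based rule would choose it; the prior shifts the decision toward the gaps near the expected boundary locations.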
Markus Feldbach, Klaus D. Tönnies
Added 04 Jul 2010
Updated 04 Jul 2010
Type Conference
Year 2003
Where ICDAR
Authors Markus Feldbach, Klaus D. Tönnies