DRR
2009

Text-image alignment for historical handwritten documents

We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim to align images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set: images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines with their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting serves as a baseline. We then show that relative lengths, i.e. the proportions of words in their lines, can be used to improve the alignment results considerably. To take the relative word length into account, we define expressions for the cost function that has to be minimized when aligning text words with their images. We apply rig...
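The two strategies named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give their cost function, so `length_cost` below is a hypothetical stand-in that compares each word's share of the characters in the transcription with its image's share of the line width. The representation of a line as a list of `(start, end)` pixel gaps, and all function names, are likewise assumptions made for illustration.

```python
from itertools import combinations


def baseline_boundaries(gaps, n_words):
    """Baseline: pick the n_words - 1 widest gaps as word boundaries."""
    widest = sorted(gaps, key=lambda g: g[1] - g[0], reverse=True)[:n_words - 1]
    return sorted(widest)


def segments(line_start, line_end, boundaries):
    """Turn the chosen gaps into (start, end) pixel spans of word images."""
    starts = [line_start] + [g[1] for g in boundaries]
    ends = [g[0] for g in boundaries] + [line_end]
    return list(zip(starts, ends))


def length_cost(spans, words):
    """Hypothetical cost (an assumption, not the paper's formula):
    sum of absolute differences between each word's share of the
    characters and its image's share of the total word-image width."""
    total_px = sum(end - start for start, end in spans)
    total_ch = sum(len(w) for w in words)
    return sum(
        abs((end - start) / total_px - len(w) / total_ch)
        for (start, end), w in zip(spans, words)
    )


def align_by_relative_length(line_start, line_end, gaps, words):
    """Try every choice of len(words) - 1 gaps and keep the cheapest one.
    Brute force for clarity; raises ValueError if there are too few gaps."""
    best = min(
        combinations(sorted(gaps), len(words) - 1),
        key=lambda chosen: length_cost(
            segments(line_start, line_end, chosen), words
        ),
    )
    return segments(line_start, line_end, best)


# Toy line: a wide pen lift inside "handwritten" and a narrower true word gap.
gaps = [(30, 36), (70, 74)]
words = ["handwritten", "text"]
print(segments(0, 100, baseline_boundaries(gaps, len(words))))
print(align_by_relative_length(0, 100, gaps, words))
```

In this toy line the widest gap falls inside the long word, so the baseline cuts there, while the length-aware variant prefers the narrower gap that leaves the first span roughly as wide as an eleven-letter word should be. The exhaustive search is only practical for short lines; a dynamic-programming search would be a natural replacement for longer ones.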
Svitlana Zinger, John Nerbonne, Lambert Schomaker
Added 17 Feb 2011
Updated 17 Feb 2011
Type Journal
Year 2009
Where DRR