Glyph extraction from historic document images

This paper is about the reproduction of ancient texts with vectorised fonts. While only recognition rates count for OCR, a reproduction process does not necessarily require the recognition of characters. Our system aims at extracting all characters from printed historic documents without using any knowledge of language, font, or writing system. It searches for the best prototypes and creates a document-specific font from these glyphs. To reach this goal, many common OCR preprocessing steps are no longer adequate. We describe the necessary changes to our system, which deals particularly with documents typeset in Fraktur. On the one hand, we describe algorithms that extract glyphs accurately for the purpose of precise reproduction. On the other hand, we present classification results of extracted Fraktur glyphs for different shape descriptors.
Categories and Subject Descriptors: I.4.3 [Image Processing and Computer Vision]: Enhancement; I.5.3 [Pattern Recognition]: Clustering G...
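To make the idea of language-independent glyph extraction and prototype selection concrete, here is a minimal sketch in Python. It is not the authors' method: it assumes a binarised page given as a list of 0/1 rows, uses plain 8-connected component labelling as a stand-in for glyph extraction, and picks the prototype as the medoid under a simple pixel-difference distance instead of the shape descriptors evaluated in the paper. All function names are hypothetical. Real Fraktur glyphs can consist of several components (for example dots and umlauts), which this sketch does not merge.

```python
from collections import deque


def extract_components(image):
    """Return each 8-connected group of ink pixels as a sorted list of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(sorted(pixels))
    return components


def normalise(pixels):
    """Shift a component to its bounding-box origin so shapes can be compared."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    return frozenset((y - top, x - left) for y, x in pixels)


def distance(a, b):
    """Number of pixel positions in which two normalised shapes differ."""
    return len(a ^ b)


def pick_prototype(shapes):
    """Return the medoid: the shape with the smallest total distance to all shapes."""
    return min(shapes, key=lambda s: sum(distance(s, t) for t in shapes))


# Toy page (assumed data): two identical vertical bars and one bar with an extra pixel.
image = [
    [1, 0, 1, 0, 1, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
]
components = extract_components(image)
shapes = [normalise(p) for p in components]
prototype = pick_prototype(shapes)
```

On the toy page, the plain bar is chosen as prototype because it differs least from the other extracted shapes. No character identities are needed at any point, which is the property the abstract emphasises.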
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Authors Lothar Meyer-Lerbs, Arne Schuldt, Björn Gottfried