ICDAR
2007
IEEE

Example-Based Logical Labeling of Document Title Page Images

This paper presents a flexible and effective example-based approach for labeling title pages, which can be used for automated extraction of bibliographic data. The labels of interest are “Title”, “Author”, “Abstract” and “Affiliation”. The method takes a set of labeled document layouts and a single unlabeled document layout as input and finds the best-matching layout in the set. The labels of this layout are then used to label the new layout. The similarity measure for layouts combines structural layout similarity and textural similarity at the block level. Experiments on the publicly available MARG dataset yield accuracy rates from 94.8% to 99.6%. This shows that our lightweight method performs as well as, and in some cases better than, more complex labeling methods known from the literature.
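The matching-and-transfer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Block` type, the scalar `feat` appearance feature, the intersection-over-union overlap, and the `alpha` weighting are all assumptions chosen for illustration, since the abstract does not specify how the structural and block-level similarities are computed or combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    # Bounding box in page coordinates normalized to [0, 1] (assumption).
    x0: float
    y0: float
    x1: float
    y1: float
    # Hypothetical block-level appearance feature in [0, 1],
    # e.g. a relative font size; stands in for the paper's block-level cue.
    feat: float = 0.0
    label: Optional[str] = None


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two boxes (assumed structural measure)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    return inter / (area_a + area_b - inter)


def block_score(a: Block, b: Block, alpha: float = 0.7) -> float:
    """Weighted mix of overlap and feature closeness (alpha is an assumption)."""
    return alpha * iou(a, b) + (1 - alpha) * (1 - abs(a.feat - b.feat))


def layout_score(new: List[Block], example: List[Block]) -> float:
    """Mean, over the new blocks, of the best block score in the example."""
    if not new or not example:
        return 0.0
    return sum(max(block_score(n, e) for e in example) for n in new) / len(new)


def label_layout(new: List[Block], examples: List[List[Block]]) -> List[Optional[str]]:
    """Pick the best-matching labeled layout and copy labels block by block."""
    best = max(examples, key=lambda ex: layout_score(new, ex))
    return [max(best, key=lambda e: block_score(n, e)).label for n in new]


# Two made-up labeled layouts and one unlabeled page.
examples = [
    [Block(0.10, 0.05, 0.90, 0.15, 0.9, "Title"),
     Block(0.30, 0.18, 0.70, 0.22, 0.5, "Author"),
     Block(0.10, 0.30, 0.90, 0.60, 0.3, "Abstract")],
    [Block(0.10, 0.05, 0.50, 0.40, 0.3, "Abstract"),
     Block(0.55, 0.05, 0.90, 0.12, 0.9, "Title"),
     Block(0.55, 0.15, 0.90, 0.20, 0.4, "Affiliation")],
]
new_page = [Block(0.12, 0.06, 0.88, 0.14, 0.85),
            Block(0.32, 0.18, 0.68, 0.23, 0.5)]

print(label_layout(new_page, examples))  # ['Title', 'Author']
```

In this toy input the first example layout matches the new page more closely, so its labels are transferred. A real system would presumably use a proper block-to-block assignment rather than an independent best match per block.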
Joost van Beusekom, Daniel Keysers, Faisal Shafait, Thomas M. Breuel
Added 19 Oct 2010
Updated 19 Oct 2010
Type Conference
Year 2007
Where ICDAR
Authors Joost van Beusekom, Daniel Keysers, Faisal Shafait, Thomas M. Breuel