In this paper we describe work relating to classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We ...
Adam Schenker, Mark Last, Horst Bunke, Abraham Kan...
Reverse engineering allows the geometric reconstruction of simple mechanical parts. However, the resulting models suffer from inaccuracies caused by errors in measurement and reco...
C. H. Gao, Frank C. Langbein, A. David Marshall, R...
This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is based on geometric configuration which uses the interline sp...
Whole-book recognition is a document image analysis strategy that operates on the complete set of a book’s page images, attempting to improve accuracy by automatic unsupervised ...
A spanned cell in a table is a single, complete unit that physically occupies multiple columns and/or multiple rows. Spanned cells are common in tables, and they are a significan...