We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
A spanned cell in a table is a single, complete unit that physically occupies multiple columns and/or multiple rows. Spanned cells are common in tables, and they are a significan...
An oracle is described for dynamic validation of an application (metadata extraction from scanned documents) where a moderate failure rate is acceptable provided that instances of...
Kurt Maly, Steven J. Zeil, Mohammad Zubair, Ashraf...
Line detection algorithms constitute the basis for technical document analysis and recognition. The performance of these algorithms decreases as the quality of the documents degra...
A document image analysis toolbox, including a collection of data structures and algorithms to support a variety of applications, is described in this paper. An experimental envir...
Jisheng Liang, Richard Rogers, Robert M. Haralick,...