This paper reports a document retrieval technique that retrieves machine-printed Latin-based document images through word shape coding. Adopting the idea of image annotation, a wo...
This paper presents PDF-TREX, an heuristic approach for table recognition and extraction from PDF documents. The heuristics starts from an initial set of basic content elements an...
This paper proposes a multi-signature document identification method that works robustly with lowresolution documents captured from handheld devices. The proposed method is based ...
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual similarity based document...
Jianying Hu, Ramanujan S. Kashi, Gordon T. Wilfong