We present a novel technique for segmentation of a JPEG-compressed document based on block activity. The activity is measured as the number of bits spent to encode each block. Each ...
With an aim to extract the structural information from the table of contents (TOC) to help develop a digital document library, the requirement of identifying/segmenting the TOC page ...
S. Mandal, S. P. Chowdhury, Amit Kumar Das, Bhabat...
This paper presents a novel block-based segmentation and adaptive coding (BSAC) algorithm for visually lossless compression of scanned documents that contain not only photographic ...
In this paper, we propose a word shape recognition method for retrieving image-based documents. Document images are segmented at the word level first. Then the proposed method det...
Page segmentation into text and non-text components is an essential preprocessing step before OCR operation. If this is not done properly, an OCR classification engine produces g...
Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breu...