Abstract. This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, an...
Abstract. Training and evaluation of techniques for handwriting recognition and retrieval is a challenge given that it is difficult to create large ground-truthed datasets. This is...
Abstract. Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching on...
Gopal Datt Joshi, Saurabh Garg, Jayanthi Sivaswamy
This paper presents a novel approach for designing a semi-automatic adaptive OCR for large document image collections in digital libraries. We describe an interactive system for co...
Sachin Rawat, K. S. Sesh Kumar, Million Meshesha, ...
This paper describes a Neural Network (NN) approach for logical document structure extraction. In this NN architecture, called Transparent Neural Network (TNN), the document struct...