For Optical Character Recognition (OCR) of bilingual or multilingual documents containing text words in a regional language and numerals in English, it is necessary to identify di...
This paper presents a novel block-based segmentation and adaptive coding (BSAC) algorithm for visually lossless compression of scanned documents that contain not only photographic ...
This paper addresses the problem of extracting information from textual documents, whether ordinary documents or web pages. A new approach for extracting complicated information from ...
Luo Xiao, Dieter Wissmann, Michael Brown, Stefan J...
In digital libraries, image retrieval queries can be based on the similarity of objects, using several feature attributes such as shape, texture, color, or text. Such multi-feature que...
In this paper we present a semi-automatic ontology editor as implemented in a new version of the OntoGen system. The system integrates machine learning and text mining algorithms into ...