This paper presents the XML-based formats ALTO, TEI, METS used for Digital Libraries and their interest for data representation in a Document Image Analysis and Recognition (DIAR)...
In this paper we present a recursive algorithm for the cleaning and the enhancing of historical documents. Most of the algorithms, used to clean and enhance documents or transform ...
In this paper, we report our approach to retrieve patent documents based on the prior art. We use the standard Information Retrieval (IR) techniques which contain indexing and retr...
Usually software is maintained by people different from those who developed it. In this context the maintenance activities are dominated by the comprehension effort. The study of ...
Abstract—A text filtering system monitors a stream of incoming documents, to identify those that match the interest profiles of its users. The user interests are registered at ...