Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and ...
This paper presents an algorithm designed to segment and classify newspaper documents. A notable feature of this algorithm is the ability to detect lines in the document
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...
Almost all document analysis approaches need to perform a global analysis of the page orientation as a separate process at an early stage. It would be preferable to estimate the o...
In order to overcome poor readability of text and recognizability of image features in low resolution thumbnails, a novel image representation of compound document images - a Smar...
Kathrin Berkner, Edward L. Schwartz, Christophe Ma...