In this paper we present an integrated approach for semantic structure extraction in document images. Document images are initially processed to extract both their layout and logic...
Existing HTML mark-up is used only to indicate the structure and layout of documents, but not the document semantics. As a result, web documents are difficult to be semantically p...
In this paper we present a system, DoLSuD, for the automatic discovery of relevant substructures in a document layout. DoLSuD, Document Layout Substructure Discovery, ext...
The paper introduces an approach that organizes retrieval results semantically and displays them spatially for browsing. Latent Semantic Analysis as well as clustering techniques are...
This paper presents a complete system that historians/archivists can use to digitize whole collections of documents relating to personal information. The system integrates tools an...