Abstract. Requirements engineering, the first phase of any software development project, is the Achilles’ heel of the whole development process, as requirements documents are of...
In this paper, we propose a document clustering method that strives to achieve: (1) a high accuracy of document clustering, and (2) the capability of estimating the number of clus...
Abstract. OCR is an error-prone process. Manually proofreading OCR results is time-consuming and expensive. The errors remaining in OCRed texts can cause serious problems in rea...
Abstract. Effective and efficient management and manipulation of XML documents require stable decisions at the time a document enters the XML DBMS to provide for storage structure...
Abstract. In this paper we present DoLSuD (Document Layout Substructure Discovery), a system for the automatic discovery of relevant substructures in a document layout. DoLSuD ext...