Document clustering techniques mostly depend on models that impose explicit and/or implicit a priori assumptions as to the number, size, disjunction characteristics of clusters, and/...
With the rise of the Internet, virtual communities of practice are gaining importance as a means of sharing and exchanging information. In such environments, information reuse is of...
Document fields, such as the title or the headings of a document, offer a way to consider the structure of documents for retrieval. Most of the proposed approaches in the literatu...
Retrieval accuracy can be improved by considering which document type should be filtered out and which should be ranked higher in the result list. Hence, document type can be used...
This paper describes the general structure of a fully automated document analysis system for printed documents. The system is based on a character preclassification stage which red...