Most document clustering techniques rely on single-term analysis of the document data set, as in the Vector Space Model. To better capture the structure of documents, the unde...
We present a major revision of the XPath benchmark known as XPathMark [1]. The new version is split into a functional test over a small educational document and a more elaborate per...
This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and ...
Similarity measurement of document images plays a crucial role in document image retrieval. A method of measuring the similarity of CCITT Group 4 compressed document images...
While the IEEE P1500 standards working group is on the verge of recommending a standard test interface for "non-mergeable" cores, a need was felt to adopt a standard met...
Michael G. Wahl, Sudipta Bhawmik, Kamran Zarrineh,...