This paper will present an approach that fosters a seamless integration of documents with corporate information systems. It is based on a conceptually enhanced notion of documents...
Text categorization involves mapping of documents to a fixed set of labels. A similar but equally important problem is that of assigning labels to large corpora. With a deluge of ...
As Chinese text is written without word boundaries, effectively recognizing Chinese words is like recognizing collocations in English, substituting characters for words and words ...
Line detection algorithms constitute the basis for technical document analysis and recognition. The performance of these algorithms decreases as the quality of the documents degra...
Many popular Web 2.0 sites support navigation of tagged web resources. The tag-based navigation has been described as a lightweight reorientation of view on tags and the associate...