Abstract. Graphics recognition deals with the specific pattern recognition problems found in graphics-rich documents, typical technical documentation of all kinds. In this paper, w...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
The availability of large, heterogeneous repositories of electronic documents is increasing rapidly, and the need for flexible, sophisticated document manipulation tools is growi...
Floriana Esposito, Stefano Ferilli, Teresa Maria A...
Topic segmentation and identification are often tackled as separate problems whereas they are both part of topic analysis. In this article, we study how topic identification can...