We discuss an intrinsic generalization of the suffix tree, designed to index a string of length n which has a natural partitioning into m multicharacter substrings or words. This ...
Bag-of-visual Words (BoW) image representation is getting popular in computer vision and multimedia communities. However, experiments show that the traditional BoW representation ...
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...
In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The pro...
Diana McCarthy, Rob Koeling, Julie Weeds, John A. ...
The implementation of word spotting is not an easy procedure and it gets even worse in the case of historical documents since it requires character recognition and indexing of the...