Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
We investigate the problem of evaluating the performance of text processing algorithms on inputs that contain errors as a result of optical character recognition. A new hierarchic...
We consider approximate pattern matching in natural language text. We use the words of the text as the alphabet, instead of the characters as in traditional string matching approac...
This paper describes a method for correction of optically read Devanagari character strings using a Hindi word dictionary. The word dictionary is partitioned in order to ...
We describe pictographic naming, a new approach to naming for pen-based computers, in which filenames are pictures rather than ASCII strings. Handwriting recognition (HWX) of a n...