The effects of different image pre-processing methods on document image binarization are explored. The pre-processing methods are compared across five binarization methods on images with bleed t...
Elisa H. Barney Smith, Laurence Likforman-Sulem, J...
Gradients are now used frequently in text images. Existing segmentation methods encounter serious problems with modern text images where gradients might ...
This paper proposes a syntactic method for detection and correction of misrecognized mathematical formulae for a practical mathematical OCR system. Linear monadic context-free tre...
Engineering diagnosis often involves analyzing complex records of system states printed to large, textual log files. Typically the logs are designed to accommodate the widest debug...
We propose a visualization method based on a topic model for discrete data such as documents. Unlike conventional visualization methods based on pairwise distances such as multi-d...