This paper presents a new enhanced text extraction algorithm from degraded document images on the basis of the probabilistic models. The observed document image is considered as a...
Although detecting text lines in machine printed documents is typically considered a solved problem, it is still a challenge to segment handwritten text lines in the general sense...
This paper proposes a word segmentation method for machine-printed text lines. It utilizes gaps and special symbols as delimiters between words. A gap clustering technique is used...
Soo-Hyung Kim, Chang Bu Jeong, Hee K. Kwag, Ching ...
In this paper a complete OCR methodology for recognizing historical documents, either printed or handwritten without any knowledge of the font, is presented. This methodology cons...
We explore automated discovery of topicallycoherent segments in speech or text sequences. We give two new discriminative topic segmentation algorithms which employ a new measure o...