This paper presents a new enhanced text extraction algorithm from degraded document images on the basis of the probabilistic models. The observed document image is considered as a...
A number of techniques have previously been proposed for effective thresholding of document images. In this paper two new thresholding techniques are proposed and compared against...
Graham Leedham, Yan Chen, Kalyan Takru, Joie Hadi ...
Abstract— Some of the established approaches to evaluating text clustering algorithms for information retrieval show theoretical flaws. In this paper, we analyze these flaws an...
Temporal Text Mining (TTM) is concerned with discovering temporal patterns in text information collected over time. Since most text information bears some time stamps, TTM has man...
The reconstruction of destroyed paper documents became of more interest during the last years. On the one hand it (often) occurs that documents are destroyed by mistake while on th...