
ICPR
2010
IEEE

A Self-Training Learning Document Binarization Framework

Document image binarization techniques have been studied for many years, and many practical binarization techniques have been developed and applied successfully in commercial document analysis systems. However, the current state-of-the-art methods fail to produce good binarization results for many badly degraded document images. In this paper, we propose a self-training learning framework for document image binarization. Based on reported binarization methods, the proposed framework first divides document image pixels into three categories, namely, foreground pixels, background pixels and uncertain pixels. A classifier is then trained by learning from the document image pixels in the foreground and background categories. Finally, the uncertain pixels are classified using the learned pixel classifier. Extensive experiments have been conducted over the dataset that is used in the recent Document Image Binarization Contest (DIBCO) 2009. Experimental results show that our proposed f...
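
The three steps in the abstract (partition pixels, train on the confident ones, classify the uncertain ones) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: two fixed thresholds (`low`, `high`) stand in for the reported binarization methods that produce the initial partition, the features are pixel intensity plus a 3x3 local mean, and a nearest-centroid rule stands in for the learned pixel classifier. All function names are hypothetical.

```python
FG, BG, UNCERTAIN = 1, 0, -1


def _mean(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def initial_labels(img, low=80, high=170):
    """Step 1: split pixels into foreground, background and uncertain.

    Assumption: two fixed thresholds replace the existing binarization
    methods that the abstract uses to build this partition.
    """
    return [[FG if p <= low else BG if p >= high else UNCERTAIN for p in row]
            for row in img]


def features(img, r, c):
    """Feature vector for one pixel: its intensity and its 3x3 local mean."""
    h, w = len(img), len(img[0])
    window = [img[i][j]
              for i in range(max(0, r - 1), min(h, r + 2))
              for j in range(max(0, c - 1), min(w, c + 2))]
    return (img[r][c], _mean(window))


def train(img, labels):
    """Step 2: learn one feature centroid per class from confident pixels."""
    centroids = {}
    for cls in (FG, BG):
        samples = [features(img, r, c)
                   for r, row in enumerate(labels)
                   for c, lab in enumerate(row) if lab == cls]
        if not samples:
            raise ValueError("no confident pixels for one of the classes")
        centroids[cls] = tuple(_mean(dim) for dim in zip(*samples))
    return centroids


def classify(centroids, feat):
    """Return the class whose centroid is nearest (squared Euclidean)."""
    def dist(cls):
        return sum((a - b) ** 2 for a, b in zip(feat, centroids[cls]))
    return min((FG, BG), key=dist)


def binarize(img, low=80, high=170):
    """Step 3: keep confident labels and classify only uncertain pixels."""
    labels = initial_labels(img, low, high)
    centroids = train(img, labels)
    return [[lab if lab != UNCERTAIN
             else classify(centroids, features(img, r, c))
             for c, lab in enumerate(row)]
            for r, row in enumerate(labels)]
```

Calling `binarize` on a grayscale image given as a list of rows of 0 to 255 intensities returns a same-shaped grid of `1` (foreground) and `0` (background). The classifier is trained per image, so it adapts to that image's own degradation, which is the self-training idea the abstract describes; a real system would likely use richer features and a stronger classifier.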
Bolan Su, Shijian Lu, Chew Lim Tan
Added 07 Dec 2010
Updated 07 Dec 2010
Type Conference
Year 2010
Where ICPR
Authors Bolan Su, Shijian Lu, Chew Lim Tan