Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model

We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OCR model, an approximate word matching method using character shape similarity, and a word segmentation algorithm using a statistical language model. By using a statistical OCR model and character shape similarity, the proposed error corrector outperforms the previously published method. When the baseline character recognition accuracy is 90%, it achieves 97.4% character recognition accuracy.
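
The abstract does not give the details of the OCR model, the shape similarity measure, or the language model, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a toy word-unigram lexicon (`LEXICON`), hand-made groups of look-alike characters (`SIMILAR`), and made-up probabilities (`P_CORRECT`, `P_SIMILAR`, `P_OTHER`); all of these names and values are hypothetical. It also handles same-length substitutions only. Approximate word matching and word segmentation are combined in a single Viterbi search over the unsegmented input.

```python
import math

# Hypothetical toy lexicon with word unigram probabilities (not from the paper).
LEXICON = {"大学": 0.3, "に": 0.3, "行く": 0.3, "犬": 0.05, "学": 0.05}

# Hypothetical groups of visually similar characters.
SIMILAR = [{"大", "犬", "太"}, {"未", "末"}]

P_CORRECT = 0.9   # assumed probability that a character is read correctly
P_SIMILAR = 0.09  # assumed probability of confusion with a look-alike character
P_OTHER = 1e-6    # assumed probability of any other substitution


def channel_prob(observed, intended):
    """P(observed char | intended char) under the toy shape-similarity model."""
    if observed == intended:
        return P_CORRECT
    if any(observed in g and intended in g for g in SIMILAR):
        return P_SIMILAR
    return P_OTHER


def correct(text):
    """Return the most probable word sequence that explains the OCR output."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log score for text[:i]
    back = [None] * (n + 1)       # back[i]: (start index, word) ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for word, p_word in LEXICON.items():
            start = end - len(word)
            if start < 0 or best[start] == -math.inf:
                continue
            # Language model score plus per-character OCR channel score.
            score = best[start] + math.log(p_word)
            for obs, intended in zip(text[start:end], word):
                score += math.log(channel_prob(obs, intended))
            if score > best[end]:
                best[end] = score
                back[end] = (start, word)
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    i = n
    while i > 0:
        if back[i] is None:
            raise ValueError("no segmentation found for the input")
        i, word = back[i]
        words.append(word)
    return words[::-1]


if __name__ == "__main__":
    # "犬学に行く" is "大学に行く" with 大 misread as the similar-looking 犬.
    print(correct("犬学に行く"))  # ['大学', 'に', '行く']
```

In this sketch the misread 犬 is corrected to 大 because the word 大学 is far more probable under the lexicon than 犬 followed by 学, and the two characters belong to the same look-alike group, so the substitution is cheap.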

Masaaki Nagata

ACL 1998 | ACL 2007 | Character Recognition Accuracy | Character Shape Similarity | Statistical OCR Model

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	ACL
Authors	Masaaki Nagata
