Sciweavers

» Topic based language models for OCR correction
SIGIR
2008
ACM
13 years 4 months ago
Topic based language models for OCR correction
Anurag Bhardwaj, Faisal Farooq, Huaigu Cao, Venu G...
ICDAR
2007
IEEE
13 years 11 months ago
Context-Sensitive Error Correction: Using Topic Models to Improve OCR
Modern optical character recognition software relies on human interaction to correct misrecognized characters. Even though the software often reliably identifies low-confidence ...
Michael L. Wick, Michael G. Ross, Erik G. Learned-...
DRR
2010
13 years 7 months ago
Efficient automatic OCR word validation using word partial format derivation and language model
In this paper we present an OCR validation module, implemented for the System for Preservation of Electronic Resources (SPER) developed at the U.S. National Library of Medicine. ...
Siyuan Chen, Dharitri Misra, George R. Thoma
ACL
1998
13 years 6 months ago
Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model
We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OC...
Masaaki Nagata
DRR
2003
13 years 6 months ago
Information retrieval for OCR documents: a content-based probabilistic correction model
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents contain a significant number of erroneous words and, unfortunately, most informat...
Rong Jin, ChengXiang Zhai, Alexander G. Hauptmann