We introduce an approach to the automatic acquisition of new concepts from natural language texts which is tightly integrated with the underlying text understanding process....
This paper presents our work on automatically locating charts from document pages, which is an important stage in the chart image recognition and understanding system being develo...
As camera resolution increases, high-speed non-contact text capture through a digital camera is opening up a new channel for text capture and understanding. Unfortunately, the cap...
A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, ...
OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...