A Bilingual OCR for Hindi-Telugu Documents and its Applications

Download cvit.iiit.ac.in

This paper describes character recognition for printed documents containing Hindi and Telugu text. Hindi and Telugu are among the most popular languages in India. The bilingual recognizer applies Principal Component Analysis followed by support vector classification, and attains an overall accuracy of approximately 96.7%. Extensive experiments are carried out on an independent test set of approximately 200,000 characters. Applications based on this OCR are sketched.
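The two-stage idea in the abstract (reduce dimensionality with PCA, then classify with a support vector machine) can be illustrated with a small, dependency-free sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 4-dimensional toy feature vectors, the two hypothetical glyph classes labelled +1 and -1, the helper names (`fit_pca`, `project`, `train_linear_svm`, `predict`), and the choice of a linear SVM trained by sub-gradient descent on the hinge loss. The abstract does not state the kernel, the features, or the multi-class scheme.

```python
import math

def fit_pca(rows, k, iters=500):
    """Return (mean, components): top-k principal directions via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    # Covariance matrix (d x d), normalised by n.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(k):
        # Fixed, asymmetric unit start vector keeps the sketch deterministic.
        start = [float(i + 1) for i in range(d)]
        s = math.sqrt(sum(x * x for x in start))
        v = [x / s for x in start]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to extract
                break
            v = [x / norm for x in w]
        # Eigenvalue estimate (Rayleigh quotient), then deflate the matrix.
        cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        eig = sum(vi * ci for vi, ci in zip(v, cv))
        cov = [[cov[i][j] - eig * v[i] * v[j] for j in range(d)]
               for i in range(d)]
        components.append(v)
    return mean, components

def project(row, mean, components):
    """Map a raw feature vector into the PCA subspace."""
    centered = [x - m for x, m in zip(row, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]

def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=300):
    """Full-batch sub-gradient descent on the L2-regularised hinge loss."""
    n, d = len(points), len(points[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for z, y in zip(points, labels):
            margin = y * (sum(wi * zi for wi, zi in zip(w, z)) + b)
            if margin < 1.0:  # hinge loss is active for this sample
                for i in range(d):
                    grad_w[i] -= y * z[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(row, mean, components, w, b):
    """Classify a raw feature vector as +1 or -1."""
    z = project(row, mean, components)
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score >= 0 else -1

# Toy data (hypothetical): two well-separated glyph classes in 4 dimensions.
train_x = [[3, 3, 0, 0], [3, 2, 0, 1], [2, 3, 1, 0], [3, 3, 1, 0],
           [0, 0, 3, 3], [0, 1, 3, 2], [1, 0, 2, 3], [0, 1, 3, 3]]
train_y = [1, 1, 1, 1, -1, -1, -1, -1]

mean, components = fit_pca(train_x, k=2)
reduced = [project(r, mean, components) for r in train_x]
w, b = train_linear_svm(reduced, train_y)
predictions = [predict(r, mean, components, w, b) for r in train_x]
```

A real recognizer would work on much higher-dimensional glyph images and many character classes, typically by combining several binary classifiers; the sketch only shows how the PCA projection feeds the classifier.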

C. V. Jawahar, M. N. S. S. K. Pavan Kumar, S. S. Ravi Kiran

Character Recognition Process | Document Analysis | Documents Containing Hindi | ICDAR 2003 | Telugu |

» Nearest neighbor based collection OCR

» A Novel Italic Detection and Rectification Method for Chinese Advertising Images

» Chinese Keyword Spotting Using Knowledge-Based Clustering

» Bilingual web page and site readability assessment

» Offline Chinese handwriting recognition: an assessment of current technology

» A Case Restoration Approach to Named Entity Tagging in Degraded Documents

» A Linear Grammar Approach to Mathematical Formula Recognition from PDF

Post Info

Added	04 Jul 2010
Updated	04 Jul 2010
Type	Conference
Year	2003
Where	ICDAR
Authors	C. V. Jawahar, M. N. S. S. K. Pavan Kumar, S. S. Ravi Kiran
