Script Identification from Indian Documents

Abstract. Automatic identification of the script in a given document image facilitates many important applications, such as automatic archiving of multilingual documents, searching online archives of document images, and selecting a script-specific OCR in a multilingual environment. In this paper, we present a scheme to identify different Indian scripts from a document image. This scheme employs hierarchical classification which uses features consistent with human perception. Such features are extracted from the responses of a multi-channel log-Gabor filter bank, designed at an optimal scale and multiple orientations. In the first stage, the classifier groups the scripts into five major classes using global features. At the next stage, a sub-classification is performed based on script-specific features. All features are extracted globally from a given text block, which does not require any complex and reliable segmentation of the document image into lines and characters. Thus the p...
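To make the idea of global features from a log-Gabor filter bank concrete, the following is a minimal sketch and not the authors' implementation. It builds frequency-domain log-Gabor filters at a single scale and several orientations, then summarizes a text block by the normalized response energy per orientation. The function names and all parameter values (`f0`, `sigma_ratio`, `sigma_theta`, `n_orient`) are assumptions chosen for illustration; the abstract does not state them, nor does it say that per-orientation energy is the exact feature used.

```python
import numpy as np

def log_gabor_bank(shape, f0=0.1, sigma_ratio=0.65, n_orient=8,
                   sigma_theta=np.pi / 16):
    """Frequency-domain log-Gabor filters: one scale, n_orient orientations.

    All default values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies (cycles/pixel)
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                   # avoid log(0); DC is zeroed below
    theta = np.arctan2(fy, fx)

    # Radial part: Gaussian on a logarithmic frequency axis, centred at f0.
    radial = np.exp(-(np.log(radius / f0) ** 2)
                    / (2 * np.log(sigma_ratio) ** 2))
    radial[0, 0] = 0.0                   # a log-Gabor filter has no DC response

    filters = []
    for k in range(n_orient):
        angle = k * np.pi / n_orient
        # Angular distance to the filter orientation, wrapped to [-pi, pi].
        d = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
        angular = np.exp(-(d ** 2) / (2 * sigma_theta ** 2))
        filters.append(radial * angular)
    return np.stack(filters)

def orientation_energies(block, **kwargs):
    """Normalized response energy per orientation for a whole text block."""
    bank = log_gabor_bank(block.shape, **kwargs)
    spectrum = np.fft.fft2(block - block.mean())
    responses = np.fft.ifft2(spectrum[None] * bank, axes=(-2, -1))
    energy = (np.abs(responses) ** 2).sum(axis=(1, 2))
    total = energy.sum()
    return energy / total if total > 0 else energy

if __name__ == "__main__":
    # Synthetic stand-in for a text block: horizontal stripes.
    y = np.arange(64)[:, None]
    block = np.sin(2 * np.pi * 0.125 * y) * np.ones((1, 64))
    print(orientation_energies(block))
```

Because the energies are computed over the whole block, no line or character segmentation is needed, which matches the global extraction described in the abstract. A hierarchical classifier could then use such a vector at the first stage and script-specific features at the second; that classifier is not sketched here.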
Gopal Datt Joshi, Saurabh Garg, Jayanthi Sivaswamy
Added 22 Aug 2010
Updated 22 Aug 2010
Type Conference
Year 2006
Where DAS