A Shared Parts Model for Document Image Recognition

We address document image classification by visual appearance. An image is represented by a variable-length list of visually salient features. A hierarchical Bayesian network is used to model the joint density of these features. This model promotes generalization from a few samples by sharing component probability distributions among different categories, and by factoring out a common displacement vector shared by all features within an image. The Bayesian network is implemented as a factor graph, and parameter estimation and inference are both done by loopy belief propagation. We explain and illustrate our model on a simple shape classification task. We obtain close to 90% accuracy in distinguishing journal articles from memos in the UWASH-II dataset, as well as on other classification tasks on a home-grown dataset of technical articles.
M. Das Gupta, P. Sarkar
Added 03 Jun 2010
Updated 03 Jun 2010
Type Conference
Year 2007
Authors M. Das Gupta, P. Sarkar