The Convergence of Iterated Classification

We report an improved methodology for training a sequence of classifiers for document image content extraction, that is, the location and segmentation of regions containing handwriting, machine-printed text, photographs, blank space, etc. The resulting segmentation is pixel-accurate, and so accommodates a wide range of zone shapes (not merely rectangles). We have systematically explored the best scale (spatial extent) of features. We have found that the methodology is sensitive to ground-truthing policy, and especially to precision of ground-truth boundaries. Experiments on a diverse test set of 83 document images show that tighter ground-truth reduces per-pixel classification errors
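The per-pixel error measure that the abstract refers to can be illustrated with a minimal sketch. It assumes that the classifier output and the ground truth are both available as equally sized 2D grids of class labels; the function name `per_pixel_error_rate` and the label strings ("BL", "MP", "HW") are hypothetical and not taken from the paper.

```python
def per_pixel_error_rate(predicted, truth):
    """Return the fraction of pixels whose predicted label differs from ground truth.

    Both arguments are lists of rows, each row a list of class labels.
    This is an illustrative measure, not necessarily the paper's exact metric.
    """
    if len(predicted) != len(truth):
        raise ValueError("label maps must have the same number of rows")
    total = 0
    wrong = 0
    for pred_row, truth_row in zip(predicted, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("label maps must have the same row lengths")
        for pred_label, truth_label in zip(pred_row, truth_row):
            total += 1
            if pred_label != truth_label:
                wrong += 1
    if total == 0:
        raise ValueError("label maps must contain at least one pixel")
    return wrong / total


# Hypothetical 2x3 label maps: BL = blank, MP = machine print, HW = handwriting.
truth = [["BL", "MP", "MP"],
         ["BL", "HW", "HW"]]
predicted = [["BL", "MP", "HW"],
             ["BL", "HW", "HW"]]

error = per_pixel_error_rate(predicted, truth)  # one of six pixels is mislabeled
```

Because every pixel is scored individually, a measure of this kind depends directly on where the ground-truth zone boundaries are drawn, which is consistent with the sensitivity to ground-truth precision that the abstract reports.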
Chang An, Henry S. Baird
Added 19 Oct 2010
Updated 19 Oct 2010
Type Conference
Year 2008
Where DAS