We show that document image decoding (DID) supervised training algorithms, as a result of recent refinements, achieve high accuracy with low manual effort even under conditions of severe image degradation in both training and test data. We describe improvements in DID training of character template, set-width, and channel (noise) models. Large-scale experimental trials, using synthetically degraded images of text, have established two new and practically important advantages of DID algorithms:
Prateek Sarkar, Henry S. Baird, Xiaohu Zhang