Scaling up document-image classifiers to handle an unlimited variety of document and image types poses serious challenges to conventional trainable classifier technologies. Highly...
The importance of text mining stems from the availability of huge volumes of text databases holding a wealth of valuable information that needs to be mined. Text categorization is...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
This paper describes an algorithm for the determination of zone content type of a given zone within a document image. We take a statistical based approach and represent each zone ...