We describe an approach to unsupervised high-accuracy recognition of the textual contents of an entire book using fully automatic mutual-entropy-based model adaptation. Given imag...
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual-similarity-based document...
Jianying Hu, Ramanujan S. Kashi, Gordon T. Wilfong
Structural information about a document is essential for structured query processing, indexing, and retrieval. A document page can be partitioned into a hierarchy of homogeneous r...
There is an increasingly pressing need to develop document analysis methods that are able to cope with images of documents containing printed regions of complex shapes. Contrary t...
Automatically generated content is ubiquitous on the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...