The method proposed herein detects text lines on handwritten pages, which may include lines oriented in several directions, erasures, or annotations between main lines. The m...
This paper investigates the role of ontologies as a central part of an architecture to repurpose existing material from the web. A prototype system called ArtEquAKT is presented, ...
Mark J. Weal, Harith Alani, Sanghee Kim, Paul H. L...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA) to deal with textual errors in the document collection. Our model is inspired by...
Many document images are rich in color and have complex backgrounds. To detect text in them, a standard approach utilizes both color and binary information. This often leads to t...
In this paper we extend the state of the art in utilizing background knowledge for supervised classification by exploiting the semantic relationships between terms explicated in O...
Meenakshi Nagarajan, Amit P. Sheth, Marcos Kawazoe...