Handwritten document images contain textlines with multi orientations, touching and overlapping characters within consecutive textlines, and small inter-line spacing making textli...
Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breu...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
This paper reports results for the University of Maryland’s participation in CLEF-2005 Cross-Language Speech Retrieval track. Techniques that were tried include: (1) document ex...
For the manual semantic markup of documents to become widespread, users must be able to express annotations that conform to ontologies (or schemas) that have shared meaning. Howev...
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification...