Many information sources use multiple modalities, such as textbooks, which contain both text and diagrams. Each captures information that is hard to express in the other, and evid...
We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
This paper proposes a new unsupervised learning method for obtaining English part-of-speech (POS) disambiguation rules which would improve the accuracy of a POS tagger. This method ...
This work belongs to a family of research efforts, called microtheories, aimed at describing the static meaning of all lexical categories in several languages in the fr,...