Abstract: Document analysis and text mining techniques are used to preprocess documents in information retrieval systems, to extract concepts in ontology construction processes, an...
The START system responds to natural language queries with answers in text, pictures, and other media. START's sentence-level natural language parsing relies on a number of m...
Boris Katz, Deniz Yuret, Jimmy J. Lin, Sue Felshin...
A trainable method for distinguishing between mathematics notation and natural language (here, English) in images of text lines, using only computational geometry methods, with no a...
JavaScript is an interpreted programming language most often used for enhancing webpage interactivity and functionality. It has powerful capabilities to interact with webpage docu...
Existing HTML markup is used only to indicate the structure and layout of documents, not their semantics. As a result, web documents are difficult to be semantically p...