Sciweavers

» Web document text and images extraction using DOM analysis a...
NLDB
2004
Springer
13 years 11 months ago
A Flexible Workbench for Document Analysis and Text Mining
Abstract: Document analysis and text mining techniques are used to preprocess documents in information retrieval systems, to extract concepts in ontology construction processes, an...
Jon Atle Gulla, Terje Brasethvik, Harald Kaada
ICMCS
1999
IEEE
13 years 10 months ago
Integrating Web Resources and Lexicons into a Natural Language Query System
The START system responds to natural language queries with answers in text, pictures, and other media. START's sentence-level natural language parsing relies on a number of m...
Boris Katz, Deniz Yuret, Jimmy J. Lin, Sue Felshin...
ICDAR
2005
IEEE
13 years 11 months ago
Distinguishing Mathematics Notation from English Text using Computational Geometry
A trainable method for distinguishing between mathematics notation and natural language (here, English) in images of textlines, using computational geometry methods only with no a...
Derek M. Drake, Henry S. Baird
WWW
2009
ACM
14 years 6 months ago
Characterizing Insecure JavaScript Practices on the Web
JavaScript is an interpreted programming language most often used for enhancing webpage interactivity and functionality. It has powerful capabilities to interact with webpage docu...
Chuan Yue, Haining Wang
APCCM
2009
13 years 7 months ago
Extracting and Modeling the Semantic Information Content of Web Documents to Support Semantic Document Retrieval
Existing HTML mark-up is used only to indicate the structure and lay-out of documents, but not the document semantics. As a result web documents are difficult to be semantically p...
Shahrul Azman Noah, Lailatulqadri Zakaria, Arifah ...