Sciweavers

» From HTML documents to web tables and rules
WWW
2002
ACM
A machine learning based approach for table detection on the web
Table is a commonly used presentation scheme, especially for describing relational information. However, table understanding remains an open problem. In this paper, we consider th...
Yalin Wang, Jianying Hu
ICMCS
1999
IEEE
Integrating Web Resources and Lexicons into a Natural Language Query System
The START system responds to natural language queries with answers in text, pictures, and other media. START's sentence-level natural language parsing relies on a number of m...
Boris Katz, Deniz Yuret, Jimmy J. Lin, Sue Felshin...
CLEIEJ
2008
Measuring Contribution of HTML Features in Web Document Clustering
Documents in HTML format have many features to analyze, from the terms in special sections to the phrases that appear in the whole document. However, it is important to decide whi...
Esteban Meneses, Oldemar Rodríguez-Rojas
CACM
1998
Viewing WISs as Database Applications
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
Gustavo O. Arocena, Alberto O. Mendelzon
WWW
2003
ACM
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...