Sciweavers

» From HTML documents to web tables and rules
IJCAI
2003
13 years 6 months ago
Information Extraction from Tree Documents by Learning Subtree Delimiters
Information extraction has conventionally treated HTML pages as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
Boris Chidlovskii
SIGIR
2005
ACM
13 years 10 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
SAINT
2005
IEEE
13 years 10 months ago
Learning Logic Wrappers for Information Extraction from the Web
This paper discusses a methodology for applying general-purpose first-order inductive learning to extract information from Web documents structured as unranked ordered trees. The...
Costin Badica, Elvira Popescu, Amelia Badica
WOA
2001
13 years 6 months ago
Object Oriented Mapping for HTML Documents
Emerging distributed technologies aim to provide simple and powerful tools for web services design and implementation. Main vendors provide modern frameworks so that a good coordi...
Francesco Garelli, Carlo Ferrari
AAAI
1997
13 years 6 months ago
Template-Based Information Mining from HTML Documents
Tools for mining information from data can create added value for the Internet. As the majority of electronic documents available over the network are in unstructured textual form...
Jane Yung-jen Hsu, Wen-tau Yih