Sciweavers

IJMSO
2008
Categorisation of web documents using extraction ontologies
Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based...
Li Xu, David W. Embley
CSREAEEE
2006
Structural Discovery of E-lessons
An e-lesson is comprised of a "body" and a "view". The body is the actual content of the e-lesson and the assumption is that it is an HTML document. The view i...
Azita Bahrami
ISEC
2001
Springer
i-Cube: A Tool-Set for the Dynamic Extraction and Integration of Web Data Content
Over the past decade the Internet has evolved into the largest public community in the world. It provides a wealth of data content and services in almost every field of science, t...
Frankie Poon, Kostas Kontogiannis
ISMIS
2003
Springer
MetaNews: An Information Agent for Gathering News Articles on the Web
This paper presents MetaNews, an information gathering agent for news articles on the Web. MetaNews reads HTML documents from online news sites and extracts article information fro...
Dae-Ki Kang, Joongmin Choi
SIGIR
2005
ACM
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
WWW
2006
ACM
HTML2RSS: automatic generation of RSS feed based on structure analysis of HTML document
We present a system to automatically generate RSS feeds from HTML documents that consist of time-series items with date expressions, e.g., archives of weblogs, BBSs, chats, mailin...
Tomoyuki Nanno, Manabu Okumura