Sciweavers

609 search results - page 8 / 122
» Adaptive record extraction from web pages
DASFAA
2005
IEEE
15 years 1 month ago
Automatic Data Extraction from Data-Rich Web Pages
Extracting data from web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. In this paper, we propose a...
Dongdong Hu, Xiaofeng Meng
APWEB
2010
Springer
14 years 9 months ago
ECON: An Approach to Extract Content from Web News Page
This paper provides a simple but effective approach, named ECON, to fully-automatically extract content from Web news page. ECON uses a DOM tree to represent the Web news...
Yan Guo, Huifeng Tang, Linhai Song, Yu Wang 0009, ...
LREC
2008
15 years 1 month ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
CLEF
2010
Springer
14 years 12 months ago
Person Attribute Extraction from the Textual Parts of Web Pages
We present the RGAI systems which participated in the third Web People Search Task challenge. The chief characteristics of our approach are that we focus on the raw textual parts o...
István Nagy, Richárd Farkas
SIGIR
2005
ACM
15 years 5 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...