Sciweavers

59 search results - page 3 / 12
» Web article extraction for web printing: a DOM visual based ...
Sort
View
AAAI
2007
13 years 7 months ago
Template-Independent News Extraction Based on Visual Consistency
Wrapper is a traditional method to extract useful information from Web pages. Most previous works rely on the similarity between HTML tag trees and induced template-dependent wrap...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen
WWW
2009
ACM
13 years 10 months ago
Extracting data records from the web using tag path clustering
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Gengxin Miao, Jun'ichi Tatemura, Wang-Pin Hsiung, ...
IV
2005
IEEE
161views Visualization» more  IV 2005»
13 years 11 months ago
Adaptive Site Map Visualization Based on Landmarks
Site maps are frequently provided on Web sites as a navigation support for Web users. The automatic generation of site maps is a complex task since the structure of the data, sema...
Dirk Kukulenz
IPM
2007
149views more  IPM 2007»
13 years 5 months ago
Web page title extraction and its application
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Yewei Xue, Yunhua Hu, Guomao Xin, Ruihua Song, Shu...
WWW
2007
ACM
14 years 6 months ago
Towards domain-independent information extraction from web tables
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...