Sciweavers

VLDB
2001
ACM
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extracti...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
DEXA
2005
Springer
An XML Approach to Semantically Extract Data from HTML Tables
Data-intensive information is often published on the internet in the form of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen
IPM
2007
Web page title extraction and its application
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Yewei Xue, Yunhua Hu, Guomao Xin, Ruihua Song, Shu...
WEBDB
1999
Springer
Web Ecology: Recycling HTML Pages as XML Documents Using W4F
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Arnaud Sahuguet, Fabien Azavant
ERLANG
2006
ACM
From HTTP to HTML: Erlang/OTP experiences in web based service applications
This paper describes the lessons learnt when internally developing web applications in Erlang. On the basis of these experiences, a framework called the Web Platform has been impl...
Francesco Cesarini, Lukas Larsson, Michal Slaski