Sciweavers

VLDB
2001
ACM
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extracti...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
DEXA
2005
Springer
An XML Approach to Semantically Extract Data from HTML Tables
Data intensive information is often published on the internet in the format of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen
IPM
2007
Web page title extraction and its application
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Yewei Xue, Yunhua Hu, Guomao Xin, Ruihua Song, Shu...
WEBDB
1999
Springer
Web Ecology: Recycling HTML Pages as XML Documents Using W4F
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Arnaud Sahuguet, Fabien Azavant
ERLANG
2006
ACM
From HTTP to HTML: Erlang/OTP experiences in web based service applications
This paper describes the lessons learnt when internally developing web applications in Erlang. On the basis of these experiences, a framework called the Web Platform has been impl...
Francesco Cesarini, Lukas Larsson, Michal Slaski