Sciweavers

VLDB
2001
ACM
13 years 9 months ago
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extracti...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
ICWE
2007
Springer
13 years 10 months ago
Fixing Weakly Annotated Web Data Using Relational Models
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data – which is typically generated by a (semi) automated information extraction (IE) ...
Fatih Gelgi, Srinivas Vadrevu, Hasan Davulcu
VLDB
2004
ACM
13 years 10 months ago
An Automatic Data Grabber for Large Web Sites
We demonstrate a system to automatically grab data from data-intensive web sites. The system first infers a model that describes at the intensional level the web site as a collec...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
DASFAA
2005
IEEE
13 years 6 months ago
Automatic Data Extraction from Data-Rich Web Pages
Extracting data from web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a...
Dongdong Hu, Xiaofeng Meng
SAC
2005
ACM
13 years 10 months ago
Pollock: automatic generation of virtual web services from web sites
As the usage of Web Services proliferates dramatically, new tools to help quickly generate web services are needed. In this paper, we propose a methodology that helps to automatic...
Yi-Hsuan Lu, Yoojin Hong, Jinesh Varia, Dongwon Le...